Abstract

The history of the theory and practice of quantization 
dates to 1948, although similar ideas had appeared in the 
literature as long ago as 1898. The fundamental role of quan- 
tization in modulation and analog-to-digital conversion was 
first recognized during the early development of pulse code 
modulation systems, especially in the 1948 paper of Oliver, 
Pierce, and Shannon. Also in 1948, Bennett published the 
first high-resolution analysis of quantization and an exact 
analysis of quantization noise for Gaussian processes, and 
Shannon published the beginnings of rate distortion theory, 
which would provide a theory for quantization as analog-to- 
digital conversion and as data compression. Beginning with 
these three papers of fifty years ago, we trace the history 
of quantization from its origins through this decade, and 
we survey the fundamentals of the theory and many of the 
popular and promising techniques for quantization. 

Keywords — Quantization, source coding, rate distortion 
theory, high resolution theory 



I. INTRODUCTION 

THE dictionary (Random House) definition of quantization
is the division of a quantity into a discrete number
of small parts, often assumed to be integral multiples of a
common quantity. The oldest example of quantization is
rounding off, which was first analyzed by Sheppard [468] for
the application of estimating densities by histograms. Any
real number x can be rounded off to the nearest integer,
say q(x), with a resulting quantization error $e = q(x) - x$ so
that $q(x) = x + e$. More generally, we can define a quantizer
as consisting of a set of intervals or cells $S = \{S_i; i \in I\}$,
where the index set $I$ is ordinarily a collection of consecutive
integers beginning with 0 or 1, together with a set
of reproduction values or points or levels $C = \{y_i; i \in I\}$,
so that the overall quantizer q is defined by $q(x) = y_i$ for
$x \in S_i$, which can be expressed concisely as

$$q(x) = \sum_i y_i 1_{S_i}(x), \tag{1}$$

where the indicator function $1_S(x)$ is 1 if $x \in S$ and 0
otherwise. For this definition to make sense we assume
that $S$ is a partition of the real line. That is, the cells
are disjoint and exhaustive. The general definition reduces
to the rounding off example if $S_i = (i - 1/2, i + 1/2]$ and
$y_i = i$ for all integers i. More generally the cells might take
the form $S_i = (a_{i-1}, a_i]$ where the $a_i$'s, which are called
thresholds, form an increasing sequence. The width of a
cell $S_i$ is its length, $a_i - a_{i-1}$. The function q(x) is often
called the quantization rule. A simple quantizer with 5
reproduction levels is depicted in Figure 1 as a collection
of intervals bordered by thresholds along with the levels for
each interval.
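
As an illustration of definition (1), the following is a minimal Python sketch of a scalar quantizer built from a list of finite thresholds and a list of levels. The function name `make_quantizer` and the example thresholds and levels are assumptions chosen for illustration; they do not come from the paper. The cells are taken to be $(a_{i-1}, a_i]$, closed on the right, with the outermost thresholds at minus and plus infinity.

```python
from bisect import bisect_left

def make_quantizer(thresholds, levels):
    """Return q(x) for cells (a_{i-1}, a_i], with outer thresholds at -inf and +inf.

    thresholds: the finite thresholds, strictly increasing
    levels: the reproduction levels, one more than the number of finite thresholds
    """
    if len(levels) != len(thresholds) + 1:
        raise ValueError("need exactly one more level than finite thresholds")
    if any(a >= b for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")

    def q(x):
        # bisect_left counts the thresholds strictly less than x, so a point
        # equal to a threshold a_i lands in the cell that is closed at a_i.
        return levels[bisect_left(thresholds, x)]

    return q

# Hypothetical example: rounding to the nearest integer, limited to 5 levels.
levels = [-2, -1, 0, 1, 2]
thresholds = [-1.5, -0.5, 0.5, 1.5]
q = make_quantizer(thresholds, levels)
print([q(x) for x in (-10.0, -0.5, 0.2, 0.5, 0.6, 10.0)])
```

Inputs beyond the outermost finite thresholds are mapped to the outermost levels, which is where the quantization error can become large.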
Fig. 1. A nonuniform quantizer: $a_0 = -\infty$, $a_5 = \infty$

A quantizer is said to be uniform if, as in the roundoff
case, the levels $y_i$ are equispaced, say $\Delta$ apart, and the
thresholds $a_i$ are midway between adjacent levels. If an
infinite number of levels are allowed, then all cells $S_i$ will
have width equal to $\Delta$, the separation between levels. If
only a finite number of levels are allowed, then all but two
cells will have width $\Delta$ and the outermost cells will be
semi-infinite. An example of a uniform quantizer with cell width
$\Delta$ and N = 8 levels is given in Figure 2.

Fig. 2. A uniform quantizer

Given a uniform quantizer with cell width $\Delta$, the region of the
input space within $\Delta/2$ of some quantizer level is called the
granular region or simply the support and that outside (where
the quantizer error is unbounded) is called the overload or
saturation region. More generally, the support or granular
region of a nonuniform quantizer is the region of the input
space within a relatively small distance of some level, and
the overload region is the complement of the granular region.
To be concrete, "small" might be defined as half the
width of the largest cell of finite width.

The quality of a quantizer can be measured by the good- 
ness of the resulting reproduction in comparison to the 
original. One way of accomplishing this is to define a 
distortion measure $d(x, \hat{x})$ that quantifies cost or distortion
resulting from reproducing $x$ as $\hat{x}$ and to consider the
average distortion as a measure of the quality of a system,
with smaller average distortion meaning higher quality.
The most common distortion measure is the squared
error $d(x, \hat{x}) = |x - \hat{x}|^2$, but we shall encounter others later.
In practice the average will be a sample average when the
quantizer is applied to a sequence of real data, but the theory
views the data as sharing a common probability density
function (pdf) f(x) corresponding to a generic random
variable X and the average distortion becomes an expectation

$$D(q) = E[d(X, q(X))] = \sum_i \int_{S_i} d(x, y_i) f(x)\, dx. \tag{2}$$

If the distortion is measured by squared error, D(q) becomes
the mean squared error (MSE), a special case on
which we shall mostly focus.

It is desirable to have the average distortion as small as 
possible, and in fact negligible average distortion is achiev- 
able by letting the cells become numerous and tiny. There 
is a cost in terms of the number of bits required to describe 
the quantizer output to a decoder, however, and arbitrarily 
reliable reproduction will not be possible for digital stor- 
age and communication media with finite capacity. A sim- 
ple method for quantifying the cost for communications 
or storage is to assume that the quantizer "codes" an input
x into a binary representation or channel codeword of
the quantizer index i specifying which reproduction level
should be used in the reconstruction. If there are N possible
levels and all of the binary representations or binary
codewords have equal length (a temporary assumption),
the binary vectors will need log N (or the next larger integer,
$\lceil \log N \rceil$, if log N is not an integer) components or bits.
Thus one definition of the rate of the code in bits per input
sample is

$$R(q) = \log N. \tag{3}$$

A quantizer with fixed-length binary codewords is said to
have fixed-rate because all quantizer levels are assumed to
have binary codewords of equal length. Later this restriction
will be weakened. Note that all logarithms in this
paper will have base 2, unless explicitly specified otherwise.
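
The fixed-rate bit count can be sketched in a few lines of Python. The helper names `fixed_rate_bits` and `codewords` are hypothetical, introduced only for illustration. The sketch computes the next larger integer of log N with exact integer arithmetic and lists equal-length binary codewords for the quantizer indices.

```python
def fixed_rate_bits(num_levels):
    """Smallest integer number of bits able to index num_levels levels."""
    if num_levels < 1:
        raise ValueError("a quantizer needs at least one level")
    # (N - 1).bit_length() equals ceil(log2(N)) for every integer N >= 1,
    # and avoids floating-point rounding in the logarithm.
    return (num_levels - 1).bit_length()

def codewords(num_levels):
    """Equal-length binary codewords for the indices 0, ..., num_levels - 1."""
    bits = fixed_rate_bits(num_levels)
    if bits == 0:
        return [""]  # a single level needs no bits at all
    return [format(i, "0{}b".format(bits)) for i in range(num_levels)]

print(fixed_rate_bits(8), fixed_rate_bits(5), fixed_rate_bits(64))
print(codewords(5))
```

When N is not a power of 2, some codewords of the chosen length go unused, which is one motivation for the variable-rate codes discussed later.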

In summary, the goal of quantization is to encode the 
data from a source, characterized by its probability den- 
sity function, into as few bits as possible (i.e. with low 
rate) in such a way that a reproduction may be recovered 
from the bits with as high quality as possible (i.e. with 
small average distortion). Clearly, there is a tradeoff be- 
tween the two primary performance measures: average dis- 
tortion (or simply distortion, as we will often abbreviate) 
and rate. This tradeoff may be quantified as the operational
distortion-rate function $\delta(R)$, which is defined to be
the least distortion of any scalar quantizer with rate R or
less. That is,

$$\delta(R) = \inf_{q:\, R(q) \le R} D(q). \tag{4}$$

Alternatively, one can define the operational rate-distortion
function r(D) as the least rate of any fixed-rate scalar
quantizer with distortion D or less, which is the inverse
of $\delta(R)$.

We have so far described scalar quantization with fixed-rate
coding, a technique whereby each data sample is independently
encoded into a fixed number of bits and decoded
into a reproduction. As we shall see, there are many
alternative quantization techniques that permit a better
tradeoff of distortion and rate; e.g. less distortion for the
same rate, or vice versa. The purpose of this paper is to
review the development of such techniques, and the theory
of their design and performance. For example, for each
type of technique we will be interested in its operational
distortion-rate function, which is defined to be the least
distortion of any quantizer of the given type with rate R
or less. We will also be interested in the best possible
performance among all quantizers. Both as a preview and
as an occasional benchmark for comparison, we informally
define the class of all quantizers as the class of quantizers
that can (1) operate on scalars or vectors instead of only
on scalars (vector quantizers), (2) have fixed or variable
rate in the sense that the binary codeword describing the
quantizer output can have length depending on the input,
and (3) be memoryless or have memory, for example using
different sets of reproduction levels, depending on the past.
In addition, we restrict attention to quantizers that do not
change with time. That is, when confronted with the same
input and the same past history, a quantizer will produce
the same output regardless of the time. We occasionally use
the term lossy source code or simply code as alternatives to
quantizer. The rate is now defined as the average number
of bits per source symbol required to describe the
corresponding reproduction symbol. We informally generalize
the operational distortion-rate function $\delta(R)$, providing the
best performance for scalar quantizers, to $\bar{\delta}(R)$, which is
defined as the infimum of the average distortion over all
quantization techniques with rate R or less. Thus $\bar{\delta}(R)$
can be viewed as the best possible performance over all
quantizers with no constraints on dimension, structure, or
complexity.

Section II begins with an historical tour of the develop- 
ment of the theory and practice of quantization over the 
past fifty years, a period encompassing almost the entire 
literature on the subject. Two complementary approaches 
dominate the history and present state of the theory, and 
three of the key papers appeared in 1948, two of them in 
Volume 27 (1948) of the Bell System Technical Journal.
Likely the approach best known to the readers of these 
Transactions is that of rate distortion theory or source 
coding with a fidelity criterion — Shannon's information 
theoretic approach to source coding — which was first sug- 
gested in his 1948 paper [464] providing the foundations of 
information theory, but which was not fully developed until 
his 1959 source coding paper [465]. The second approach is 
that of high resolution (or high rate or asymptotic) quan- 
tization theory, which had its origins in the 1948 paper on 
PCM by Oliver, Pierce and Shannon [394], the 1948 paper 
on quantization error spectra by Bennett [43], and the 1951 
paper by Panter and Dite [405]. Much of the history and
state of the art of quantization derives from these seminal 
works. 

In contrast to these two asymptotic theories, there is
also a small but important collection of results that are not
asymptotic in nature. The oldest such results are the exact
analyses for special nonasymptotic cases, such as Clavier,
Panter, and Grieg's 1947 analysis of the spectra of the
quantization error for uniformly quantized sinusoidal signals
[99], [100] and Bennett's 1948 derivation of the power
spectral density of a uniformly quantized Gaussian random
process [43]. The most important nonasymptotic results,
however, are the basic optimality conditions and iterative
descent algorithms for quantizer design, such as first developed
by Steinhaus (1956) [480] and Lloyd (1957) [330],
and later popularized by Max (1960) [349].

Our goal in the next section is to introduce in histor- 
ical context many of the key ideas of quantization that 
originated in classical works and evolved over the past 50 
years, and in the remaining sections to survey selectively 
and in more detail a variety of results which illustrate both 
the historical development and the state of the field. Sec- 
tion III will present basic background material that will be 
needed in the remainder of the paper, including the general 
definition of a quantizer and the basic forms of optimality 
criteria and descent algorithms. Some such material has 
already been introduced and more will be introduced in 
Section II. However, for completeness, Section III will be 
largely self-contained. Section IV reviews the development 
of quantization theories and compares the approaches. Fi- 
nally, Section V describes a number of specific quantization 
techniques. 

In any review of a large subject such as quantization 
there is not space to discuss or even mention all work on the 
subject. Though we have made an effort to select the most 
important work, no doubt we have missed some important 
work due to bias, misunderstanding, or ignorance. For this 
we apologize, both to the reader and to the researchers 
whose work we may have neglected. 

II. HISTORY

The history of quantization often takes on several paral- 
lel paths, which causes some problems in our clustering of 
topics. We follow roughly a chronological order within each 
and order the paths as best we can. Specifically, we will 
first track the design and analysis of practical quantization 
techniques in three paths: fixed-rate scalar quantization, 
which leads directly from the discussion of Section I, predic- 
tive and transform coding, which adds linear processing to 
scalar quantization in order to exploit source redundancy, 
and variable-rate quantization, which uses Shannon's loss- 
less source coding techniques [464] to reduce rate. (Loss- 
less codes were originally called noiseless.) Next we follow 
early forward looking work on vector quantization, includ- 
ing the seminal work of Shannon and Zador, in which vector 
quantization appears more to be a paradigm for analyzing 
the fundamental limits of quantizer performance than a 
practical coding technique. A surprising amount of such 
vector quantization theory was developed outside the con- 
ventional communications and signal processing literature. 
Subsequently, we review briefly the developments from the 
mid 1970's to the mid 1980's which mainly concern the 
emergence of vector quantization as a practical technique. 
Finally, we sketch briefly developments from the mid 1980's 
to the present. Except where stated otherwise, we presume 
squared error as the distortion measure. 

A. Fixed-Rate Scalar Quantization: PCM and the Origins
of Quantization Theory 

Both quantization and source coding with a fidelity cri- 
terion have their origins in pulse code modulation (PCM), 
a technique patented in 1938 by Reeves [432], who 25 years
later wrote an historical perspective on and an appraisal of
the future of PCM with Deloraine [120]. The predictions
were surprisingly accurate as to the eventual ubiquity of
digital speech and video. The technique was first successfully
implemented in hardware by Black, who reported the
principles and implementation in 1947 [51], as did another
Bell Labs paper by Goodall [209]. PCM was subsequently
analyzed in detail and popularized by Oliver, Pierce, and
Shannon in 1948 [394]. PCM was the first digital technique
for conveying an analog information signal (principally
telephone speech) over an analog channel (typically,
a wire or the atmosphere). In other words it is a modulation
technique, i.e., an alternative to AM, FM and various
other types of pulse modulation. It consists of three main
components: a sampler (including a prefilter), a quantizer
(with a fixed-rate binary encoder) and a binary pulse modulator.
The sampler converts a continuous-time waveform
$x(t)$ into a sequence of samples $x_n = x(n/f_s)$, where $f_s$
is the sampling frequency. The sampler is ordinarily preceded
by a lowpass filter with cutoff frequency $f_s/2$. If
the filter is ideal, then the Shannon-Nyquist or
Shannon-Whittaker-Kotelnikov sampling theorem ensures that the
lowpass filtered signal can, in principle, be perfectly recovered
by appropriately filtering the samples. Quantization
of the samples renders this an approximation, with
the MSE of the recovered waveform being, approximately,
the sum of the MSE of the quantizer, D(q), and the high
frequency power removed by the lowpass filter. The binary
pulse modulator typically uses the bits produced by the 
quantizer to determine the amplitude, frequency or phase 
of a sinusoidal carrier waveform. In the evolutionary de- 
velopment of modulation techniques it was found that the 
performance of pulse amplitude modulation in the presence 
of noise could be improved if the samples were quantized 
to the nearest of a set of N levels before modulating the 
carrier (64 equally spaced levels was typical). Though this 
introduces quantization error, deciding which of the N lev- 
els had been transmitted in the presence of noise could be 
done with such reliability that the overall MSE was sub- 
stantially reduced. Reducing the number of quantization 
levels N made it even easier to decide which level had been 
transmitted, but came at the cost of a considerable increase 
in the MSE of the quantizer. A solution was to fix N at a 
value giving acceptably small quantizer MSE and to binary 
encode the levels, so that the receiver had only to make 
binary decisions, something it can do with great reliabil- 
ity. The resulting system, PCM, had the best resistance to 
noise of all modulations of the time. 

As the digital era emerged, it was recognized that the 
sampling, quantizing, and encoding part of PCM performs 
an analog-to-digital (A/D) conversion, with uses extending 
much beyond communication over analog channels. Even 
in the communications field, it was recognized that the task 
of analog-to-digital conversion (and source coding) should 
be factored out of binary modulation as a separate task. 
Thus, PCM is now generally considered to just consist of 
sampling, quantizing and encoding; i.e., it no longer in- 
cludes the binary pulse modulation. 

Although quantization in the information theory literature
is generally considered as a form of data compression,
its use for modulation or A/D conversion was originally 
viewed as data expansion or, more accurately, bandwidth 
expansion. For example, a speech waveform occupying 
roughly 4kHz would have a Nyquist rate of 8kHz. Sam- 
pling at the Nyquist rate and quantizing at 8 bits per sam- 
ple and then modulating the resulting binary pulses using 
amplitude or frequency shift keying would yield a signal 
occupying roughly 64kHz, a 16 fold increase in bandwidth! 
Mathematically this constitutes compression in the sense 
that a continuous waveform requiring an infinite number 
of bits is reduced to a finite number of bits, but for practi- 
cal purposes PCM is not well interpreted as a compression 
scheme. 

In an early contribution to the theory of quantization, 
Clavier, Panter, and Grieg (1947) [99], [100] applied Rice's
characteristic function or transform method [434] to pro- 
vide exact expressions for the quantization error and its 
moments resulting from uniform quantization for certain 
specific inputs, including constants and sinusoids. The 
complicated sums of Bessel functions resembled the early
analyses of another nonlinear modulation technique, FM, 
and left little hope for general closed form solutions for 
interesting signals. 

The first general contributions to quantization theory 
came in 1948 with the papers of Oliver, Pierce, and Shan- 
non [394] and Bennett [43]. As part of their analysis of 
PCM for communications, they developed the oft-quoted 
result that for large rate or resolution, a uniform quantizer 
with cell width $\Delta$ yields average distortion $D(q) \cong \Delta^2/12$.
If the quantizer has N levels and rate R = log N, and the
source has input range (or support) of width A, so that
$\Delta = A/N$ is the natural choice, then the $\Delta^2/12$ approximation
yields the familiar form for the signal-to-noise ratio
(SNR) of

$$\mathrm{SNR} \cong c + 6R \ \text{dB},$$

showing that for large rate, the SNR of uniform quantization
increases 6 dB for each one bit increase of rate, which
is often referred to as the "6 dB per bit rule". The $\Delta^2/12$
formula is considered a high-resolution formula, indeed the
first such formula, in that it applies to the situation where
the cells and average distortion are small, and the rate is
large, so that the reproduction produced by the quantizer
is quite accurate. The $\Delta^2/12$ result also appeared many
years earlier (albeit in somewhat disguised form) in Sheppard's
1898 treatment [468].
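
The $\Delta^2/12$ approximation and the 6 dB per bit rule can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not part of the original analysis: the source is assumed uniform on [0, A) with A = 1, the quantizer has N = 2^R cells of width A/N with levels at the cell midpoints, and the expectation is approximated by averaging over a fine grid. The function names are hypothetical.

```python
import math

def uniform_quantize(x, delta, n_levels):
    """Midpoint uniform quantizer on [0, n_levels * delta)."""
    i = min(max(int(x // delta), 0), n_levels - 1)  # clamp to the outer cells
    return (i + 0.5) * delta

def average_distortion(rate_bits, width=1.0, points_per_cell=1000):
    """Grid estimate of the MSE for a source assumed uniform on [0, width)."""
    n = 2 ** rate_bits
    delta = width / n
    grid = n * points_per_cell
    total = 0.0
    for k in range(grid):
        x = (k + 0.5) * width / grid
        total += (uniform_quantize(x, delta, n) - x) ** 2
    return total / grid

def snr_db(rate_bits, width=1.0):
    """Ratio of source variance (width^2 / 12) to quantizer MSE, in dB."""
    variance = width ** 2 / 12.0
    return 10.0 * math.log10(variance / average_distortion(rate_bits, width))

for r in (2, 4, 6):
    delta = 1.0 / 2 ** r
    print(r, average_distortion(r), delta ** 2 / 12.0, snr_db(r))
```

For this particular source the measured distortion should match $\Delta^2/12$ closely at every rate, and the SNR should grow by about 6.02 dB per added bit, with the additive constant equal to zero.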

Bennett also developed several other fundamental results
in quantization theory. He generalized the high-resolution
approximation for uniform quantization to provide an
approximation to D(q) for companders, systems that preceded
a uniform quantizer by a monotonic smooth nonlinearity
called a "compressor," say G, and used the inverse
nonlinearity when reconstructing the signal. Thus
the output reproduction $\hat{x}$ given an input $x$ was given by
$\hat{x} = G^{-1}(q(G(x)))$, where q is a uniform quantizer. Bennett
showed that in this case

$$D(q) \cong \frac{\Delta^2}{12} \int \frac{f(x)}{g^2(x)}\, dx, \tag{5}$$

where $g(x) = dG(x)/dx$, $\Delta$ is the cell width of the uniform
quantizer, and the integral is taken over the granular range
of the input. (The constant 1/12 in the above assumes
that G maps to the unit interval [0,1].) Since, as Bennett
pointed out, any nonuniform quantizer can be implemented
as a compander, this result, often referred to as "Bennett's
integral," provides an asymptotic approximation for any
quantizer. It is useful to jump ahead and point out that g
can be interpreted, as Lloyd would explicitly point out in
1957 [330], as a constant times a "quantizer point-density
function $\lambda(x)$," that is, a function with the property that
for any region S

$$\text{number of quantizer levels in } S \approx N \int_S \lambda(x)\, dx. \tag{6}$$

Since integrating $\lambda(x)$ over a region gives the fraction of
quantizer reproduction levels in the region, it is evident
that $\lambda(x)$ is normalized so that $\int_{-\infty}^{\infty} \lambda(x)\, dx = 1$. It will also
prove useful to consider the unnormalized quantizer point
density $\Lambda(x)$, which when integrated over S gives the total
number of levels within S rather than the fraction. In
the current situation $\Lambda(x) = N\lambda(x)$, but the unnormalized
density will generalize to the case where N is infinite.

Rewriting Bennett's integral in terms of the point-density
function yields its more common form

$$D(q) \cong \frac{1}{12} \frac{1}{N^2} \int \frac{f(x)}{\lambda^2(x)}\, dx. \tag{7}$$

The idea of a quantizer point-density function will generalize
to vectors, while the compander approach will not in
the sense that not all vector quantizers can be represented
as companders [192].
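
A compander and the prediction of Bennett's integral can be compared numerically on a toy case. Everything specific in the sketch below is an assumption made for illustration: the source density is taken to be the triangular density f(x) = 2x on [0, 1], the compressor is G(x) = x^(4/3) (whose slope is proportional to the cube root of this density), and the expectation is approximated by a midpoint-rule sum. For these choices the integral of f divided by the squared point density works out to 27/32.

```python
def compander_quantize(x, n_levels, G, G_inv):
    """x_hat = G_inv(q(G(x))) with q a uniform midpoint quantizer on [0, 1]."""
    u = G(x)
    i = min(int(u * n_levels), n_levels - 1)
    return G_inv((i + 0.5) / n_levels)

def average_distortion(quantize, pdf, grid=128_000):
    """Midpoint-rule estimate of E[(q(X) - X)^2] for a density on [0, 1]."""
    total = 0.0
    for k in range(grid):
        x = (k + 0.5) / grid
        total += (quantize(x) - x) ** 2 * pdf(x)
    return total / grid

def pdf(x):
    return 2.0 * x               # assumed triangular source density

def G(x):
    return x ** (4.0 / 3.0)      # assumed compressor, maps [0, 1] onto [0, 1]

def G_inv(u):
    return u ** 0.75

def identity(t):
    return t

N = 64
measured = average_distortion(lambda x: compander_quantize(x, N, G, G_inv), pdf)
bennett = (27.0 / 32.0) / (12.0 * N ** 2)   # Bennett's integral for these choices
uniform = average_distortion(
    lambda x: compander_quantize(x, N, identity, identity), pdf)
print(measured, bennett, uniform)
```

At N = 64 the measured compander distortion should be close to, though not identical with, the high-resolution prediction, and it should fall below the distortion of the plain uniform quantizer with the same number of levels.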

Bennett also demonstrated that, under assumptions of 
high resolution and smooth densities, the quantization 
error behaved much like random "noise": it had small 
correlation with the signal and had approximately a flat 
("white") spectrum. This led to an "additive noise" model 
of quantizer error, since with these properties the formula 
q(X) = X + [q(X) - X] could be interpreted as representing
the quantizer output as the sum of a signal and white
noise. This model was later popularized by Widrow [528], 
[529], but the viewpoint avoids the fact that the "noise" 
is in fact dependent on the signal and the approximations 
are valid only under certain conditions. Signal independent
quantization noise has generally been found to be percep- 
tually desirable. This was the motivation for randomizing 
the action of quantization by the addition of a dither signal, 
a method introduced by Roberts [442] as a means of mak- 
ing quantized images look better by replacing the artifacts 
resulting from deterministic errors by random noise. We 
shall return to dithering in Section V, where it will be seen 
that suitable dithering can indeed make exact the Bennett 
approximations of uniform distribution and signal independence
of the overall quantizer noise. Bennett also used a
variation of Rice's method to derive an exact computation 
of the spectrum of quantizer noise when a Gaussian process 
is uniformly quantized, providing one of the very few exact 
computations of quantization error spectra. 

In 1951 Panter and Dite [405] developed a high-resolution
formula for the distortion of a fixed-rate scalar
quantizer using approximations similar to Bennett's, but
without reference to Bennett. They then used variational
techniques to minimize their formula and found the following
formula for the operational distortion-rate function of
fixed-rate scalar quantization: for large values of R,

$$\delta(R) \cong \frac{1}{12} \left( \int f^{1/3}(x)\, dx \right)^3 2^{-2R}, \tag{8}$$

which is now called the Panter and Dite formula.¹ As
part of their derivation, they demonstrated that an optimal
quantizer resulted in roughly equal contributions to
total average distortion from each quantization cell, a result
later called the "partial distortion theorem." Though
they did not rederive Bennett's integral, they had in effect
derived the optimal compressor function for a compander,
or, equivalently, the optimal quantizer point density:

$$\lambda(x) = \frac{f^{1/3}(x)}{\int f^{1/3}(x')\, dx'}. \tag{9}$$

Indeed, substituting this point density into Bennett's integral
and using the fact that R = log N yields (8). As an
example, if the input density is Gaussian with variance $\sigma^2$,
then

$$\delta(R) \cong \frac{1}{12}\, 6\pi\sqrt{3}\, \sigma^2\, 2^{-2R}. \tag{10}$$

The fact that for large rates $\delta(R)$ decreases with R as $2^{-2R}$
implies that the signal-to-noise ratio increases according to
the 6 dB per bit rule. Virtually all other high resolution
formulas to be given later will also obey this rule. However,
the constant that adds to 6R will vary with the source and
quantizer being considered.

The Panter-Dite formula for $\delta(R)$ can also be derived
directly from Bennett's integral using variational methods,
as did Lloyd (1957) [330], Smith (1957) [474] and,
much later without apparent knowledge of earlier work,
Roe (1964) [443]. It can also be derived without using
variational methods by application of Hölder's inequality
to Bennett's integral [222], with the additional benefit of
demonstrating that the claimed minimum is indeed global.
Though not known at the time, it turns out that for a Gaussian
source with independent and identically distributed
(i.i.d.) samples, the operational distortion-rate function
given above is $\pi\sqrt{3}/2 = 2.72$ times larger than $\bar{\delta}(R)$, the
least distortion achievable by any quantization technique
with rate R or less. (It was not until Shannon's 1959 paper
[465] that $\bar{\delta}(R)$ was known.) Equivalently, the induced
signal-to-noise ratio is 4.35 dB less than the best possible,
or for a fixed distortion D the rate is 0.72 bits/sample larger
than that achievable by the best quantizers.

¹ They also indicated that it had been derived earlier by P.R.
Aigrain.
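
The Gaussian constants quoted above can be reproduced with a short numerical check. The sketch below assumes a unit-variance Gaussian density and evaluates the integral of its cube root by the midpoint rule over a wide finite interval. The function name `panter_dite_constant` is hypothetical, and the integration limits and step count are arbitrary illustrative choices.

```python
import math

def panter_dite_constant(pdf, lo, hi, steps=200_000):
    """(1/12) * (integral of pdf^(1/3) over [lo, hi])^3, by the midpoint rule."""
    h = (hi - lo) / steps
    s = sum(pdf(lo + (k + 0.5) * h) ** (1.0 / 3.0) for k in range(steps)) * h
    return s ** 3 / 12.0

sigma = 1.0  # assumed standard deviation of the Gaussian source

def gauss(x):
    return math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

c_numeric = panter_dite_constant(gauss, -20.0 * sigma, 20.0 * sigma)
c_closed = math.pi * math.sqrt(3.0) / 2.0 * sigma ** 2   # (1/12) * 6 * pi * sqrt(3)

# Gap to the best possible performance, using the factor stated in the text.
gap_db = 10.0 * math.log10(c_closed / sigma ** 2)
gap_bits = 0.5 * math.log2(c_closed / sigma ** 2)
print(c_numeric, c_closed, gap_db, gap_bits)
```

The numerical and closed-form constants should agree to many digits, and the two gap values should round to the 4.35 dB and 0.72 bits/sample figures given in the text.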

In 1957 Smith [474] reexamined companding and PCM. 
Among other things, he gave somewhat cleaner derivations 
of Bennett's integral, the optimal compressor function, and 
the Panter-Dite formula.

Also in 1957, Lloyd [330] made an important study of 
quantization with three main contributions. First, he found 
necessary and sufficient conditions for a fixed-rate quan- 
tizer to be locally optimal; i.e. conditions that if satisfied 
implied that small perturbations to the levels or thresh- 
olds would increase distortion. Any optimal quantizer (one 
with smallest distortion) will necessarily satisfy these conditions,
and so they are often called the optimality conditions
or the necessary conditions. Simply stated, Lloyd's
optimality conditions are that for a fixed-rate quantizer to
be optimal, the quantizer partition must be optimal for the
set of reproduction levels, and the set of reproduction levels
must be optimal for the partition. Lloyd derived these
conditions straightforwardly from first principles, without 
recourse to variational concepts such as derivatives. For 
the case of mean squared error, the first condition im- 
plies a minimum distance or nearest neighbor quantization 
rule, choosing the closest available reproduction level to 
the source sample being quantized, and the second condi- 
tion implies that the reproduction level corresponding to a 
given cell is the conditional expectation or centroid of the 
source value given that it lies in the specified cell; i.e., it 
is the minimum mean squared error estimate of the source 
sample. For some sources there are multiple locally optimal
quantizers, not all of which are globally optimal.

Second, based on his optimality conditions, Lloyd devel- 
oped an iterative descent algorithm for designing quantiz- 
ers for a given source distribution: begin with an initial 
collection of reproduction levels, optimize the partition for 
these levels by using a minimum distortion mapping, which 
gives a partition of the real line into intervals, then opti- 
mize the set of levels for the partition by replacing the 
old levels by the centroids of the partition cells. The alternation
is continued until convergence to a local, if not
global, optimum. Lloyd referred to this design algorithm 
as "Method I." He also developed a Method II based on 
the optimality properties. First choose an initial smallest 
reproduction level. This determines the cell threshold to 
the right, which in turn implies the next larger reproduc- 
tion level, and so on. This approach alternately produces a 
level and a threshold. Once the last level has been chosen, 
the initial level can then be rechosen to reduce distortion
and the algorithm continues. Lloyd provided design exam- 
ples for uniform, Gaussian and Laplacian random variables 
and showed that the results were consistent with the high 
resolution approximations. Although Method II would ini- 
tially gain more popularity when rediscovered in 1960 by 
Max [349], it is Method I that easily extends to vector 
quantizers and many types of quantizers with structural 
constraints. 
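
The alternation in Method I can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than Lloyd's original formulation: it works on a finite training sample instead of a known density, uses squared error, and draws pseudo-random Gaussian samples with a fixed seed. The function name `lloyd_method_1`, the initial levels, and the sample size are hypothetical choices.

```python
import random

def lloyd_method_1(samples, levels, iterations=30):
    """Alternate nearest-neighbor partition and centroid update on samples."""
    levels = sorted(levels)
    history = []
    for _ in range(iterations):
        # Step 1: optimal partition for the current levels (nearest level wins).
        cells = [[] for _ in levels]
        for x in samples:
            i = min(range(len(levels)), key=lambda j: (x - levels[j]) ** 2)
            cells[i].append(x)
        # Record the average distortion of the current levels and partition.
        history.append(
            sum((x - levels[i]) ** 2 for i, cell in enumerate(cells) for x in cell)
            / len(samples))
        # Step 2: optimal levels for that partition (the cell centroids);
        # an empty cell simply keeps its previous level.
        levels = [sum(c) / len(c) if c else y for c, y in zip(cells, levels)]
    return levels, history

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(5000)]
final_levels, history = lloyd_method_1(samples, [-1.5, -0.5, 0.5, 1.5])
print(final_levels)
print(history[0], history[-1])
```

Each pass can only lower or maintain the recorded distortion, since the centroid update is optimal for the current partition and the nearest-neighbor partition is optimal for the current levels. The limit reached depends on the initial levels and need not be the global optimum.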

Third, motivated by the work of Panter and Dite but
apparently unaware of that of Bennett or Smith, Lloyd
rederived Bennett's integral and the Panter-Dite formula
based on the concept of point-density function. This was 
a critically important step for subsequent generalizations 
of Bennett's integral to vector quantizers. He also showed 
directly that in situations where the global optimum is the 
only local optimum, quantizers that satisfy the optimality 
conditions have, asymptotically, the optimal point density 
given by (9). 

Unfortunately Lloyd's work was not published in an 
archival journal at the time. Instead, it was presented at 
the 1957 Institute of Mathematical Statistics (IMS) meet- 
ing and appeared in print only as a Bell Laboratories Tech- 
nical Memorandum. As a result, its results were not widely 
known in the engineering literature for many years, and 
many were independently rediscovered. All of the indepen- 
dent rediscoveries, however, used variational derivations, 
rather than Lloyd's simple derivations. The latter were es- 
sential for later extensions to vector quantizers and to the 
development of many quantizer optimization procedures. 
To our knowledge, the first mention of Lloyd's work in 
the IEEE literature came in 1964 with Fleischer's [170] 
derivation of a sufficient condition (namely, that the log 
of the source density be concave) in order that the optimal 
quantizer be the only locally optimal quantizer, and con- 
sequently, that Lloyd's Method I yields a globally optimal 
quantizer. (The condition is satisfied for common densities 
such as Gaussian and Laplacian.) Zador [561] had referred 
to Lloyd a year earlier in his Ph.D. thesis, to be discussed 
later. 

Later in the same year in another Bell Telephone Labora- 
tories Technical Memorandum, Goldstein [207] used vari- 
ational methods to derive conditions for global optimal- 
ity of a scalar quantizer in terms of second order partial 
derivatives with respect to the quantizer levels and thresh- 
olds. He also provided a simple counterintuitive example 
of a symmetric density for which the optimal quantizer was 
asymmetric. 

In 1959 Shtein [471] added terms representing overload
distortion to the $\Delta^2/12$ formula and to Bennett's
integral and used them to optimize uniform and nonuniform
quantizers. Unaware of prior work except for Bennett's,
he rederived the optimal compressor characteristic and the
Panter-Dite formula.

In 1960 Max [349] published a variational proof of the 
Lloyd optimality properties for rth power distortion mea- 
sures, rediscovered Lloyd's Method II, and numerically in- 
vestigated the design of fixed-rate quantizers for a variety 
of input densities. 

Also in 1960, Widrow [529] derived an exact formula for
the characteristic function of a uniformly quantized signal
when the quantizer has an infinite number of levels. His
results showed that under the condition that the
characteristic function of the input signal be zero when its
argument is greater than $\pi/\Delta$, the moments of the
quantized random variable are the same as the moments of the
signal plus an additive signal-independent random variable
uniformly distributed on $(-\Delta/2, \Delta/2]$. This has
often been misinterpreted as saying that the quantized random
variable can be approximated as being the input plus
signal-independent uniform noise, a clearly false statement
since the quantizer error $q(X) - X$ is a deterministic
function of the signal. The "bandlimited" property of the
characteristic function implies from Fourier transform theory
that the probability density function must have infinite
support since a signal and its transform cannot both be
perfectly bandlimited.

We conclude this subsection by mentioning early work
that appeared in the mathematical and statistical literature
and which, in hindsight, can be viewed as related to
scalar quantization. Specifically, in 1950-1951 Dalenius et
al. [118], [119] used variational techniques to consider
optimal grouping of Gaussian data with respect to average
squared error. Lukaszewicz and H. Steinhaus [336] (1955)
developed what we now consider to be the Lloyd optimality
conditions using variational techniques in a study of
optimum go/no-go gauge sets (as acknowledged by Lloyd).
Cox in 1957 [111] also derived similar conditions. Some
additional early work, which can now be seen as relating
to vector quantization, will be reviewed later [480], [159],
[561].

B. Scalar Quantization with Memory 

It was recognized early that common sources such as
speech and images had considerable "redundancy" that
scalar quantization could not exploit. The term
"redundancy" was commonly used in the early days and is still
popular in some of the quantization literature. Strictly
speaking it refers to the statistical correlation or
dependence between the samples of such sources and is usually
referred to as memory in the information theory literature.
As our current emphasis is historical, we follow the
traditional language. While not disrupting the performance
of scalar quantizers, such redundancy could be exploited
to attain substantially better rate-distortion performance.
The early approaches toward this end combined linear
processing with scalar quantization, thereby preserving the
simplicity of scalar quantization while using intuition-based
arguments and insights to improve performance by
incorporating memory into the overall code. The two most
important approaches of this variety were predictive coding
and transform coding. A shared intuition was that a
preprocessing operation intended to make scalar quantization
more efficient should "remove the redundancy" in the data.
Indeed, to this day there is a common belief that data
compression is equivalent to redundancy removal and that data
without redundancy cannot be further compressed. As will
be discussed later, this belief is contradicted both by
Shannon's work, which demonstrated strictly improved
performance using vector quantizers even for memoryless
sources, and by the early work of Fejes Toth (1959) [159].
Nevertheless, removing redundancy leads to much improved
codes.

Predictive quantization appears to originate in the 1946
delta modulation patent of Derjavitch, Deloraine, and Van
Mierlo [129], but the most commonly cited early references
are Cutler's patent [117] (2,605,361) on "Differential
quantization of communication signals" and DeJager's Philips
technical report on delta modulation [128]. Cutler stated
in his patent that it "is the object of the present
invention to improve the efficiency of communication systems
by taking advantage of correlation in the signals of these
systems" and Derjavitch et al. also cited the reduction
of redundancy as the key to the reduction of quantization
noise. In 1950 Elias [141] provided an information theoretic
development of the benefits of predictive coding, but the
work was not published until 1955 [142]. Other early
references include [395], [300], [237], [511], [572]. In
particular, [511] claims Bennett-style asymptotics for high
resolution quantization error, but as will be discussed later
such approximations have yet to be rigorously derived.

From the point of view of least squares estimation
theory, if one were to optimally predict a data sequence based
on its past in the sense of minimizing the mean squared
error, then the resulting error or residual or innovations
sequence would be uncorrelated and it would have the
minimum possible variance. To permit reconstruction in a
coded system, however, the prediction must be based on
past reconstructed samples and not true samples. This
is accomplished by placing a quantizer inside a prediction
loop and using the same predictor to decode the signal. A
simple predictive quantizer or differential pulse coded
modulator (DPCM) is depicted in Figure 3. If the predictor is
simply the last sample and the quantizer has only one bit,
the system becomes a delta-modulator. Predictive quantizers
are considered to have memory in that the quantization
of a sample depends on previous samples, via the feedback
loop.

Predictive quantizers have been extensively developed
(for example, there are many adaptive versions) and are
widely used in speech and video coding, where a number
of standards are based on them. In speech coding they
form the basis of ITU-G.721, 722, 723, and 726, and in
video coding they form the basis of the interframe coding
schemes standardized in the MPEG and H.26X series.
Comprehensive discussions may be found in books [265],
[374], [196], [424], [50], [458] and survey papers [264], [198].
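
The prediction loop described above can be sketched in a few
lines of Python. This is only a minimal illustration under
stated assumptions, not a description of any standardized
coder: the predictor is a fixed first-order predictor with a
hypothetical coefficient `a`, the quantizer is a uniform
quantizer with infinitely many levels (so there is no
overload), and the names `uniform_quantize`, `dpcm_encode`
and `dpcm_decode` are introduced here for illustration. The
encoder predicts from its own past reconstructions, so the
decoder can run the identical recursion.

```python
def uniform_quantize(value, step):
    """Round value to the nearest multiple of step (no overload region)."""
    return step * round(value / step)


def dpcm_encode(samples, step, a=1.0):
    """Return the quantized prediction errors for a first-order predictor.

    The prediction is a * (previous reconstruction), not a * (previous
    source sample), so that the decoder can form the same prediction.
    """
    residuals = []
    prev_recon = 0.0
    for x in samples:
        prediction = a * prev_recon
        e_hat = uniform_quantize(x - prediction, step)
        residuals.append(e_hat)
        prev_recon = prediction + e_hat  # same update the decoder performs
    return residuals


def dpcm_decode(residuals, a=1.0):
    """Rebuild the reproduction sequence from quantized prediction errors."""
    recon = []
    prev_recon = 0.0
    for e_hat in residuals:
        prev_recon = a * prev_recon + e_hat
        recon.append(prev_recon)
    return recon


if __name__ == "__main__":
    source = [0.0, 0.9, 1.7, 2.1, 1.4, 0.2]
    step = 0.5
    decoded = dpcm_decode(dpcm_encode(source, step))
    for x, y in zip(source, decoded):
        print(f"source {x:5.2f}  reproduction {y:5.2f}  error {x - y:+.2f}")
```

Because the reproduction is the prediction plus the quantized
prediction error, the overall error for each sample equals
the error of the scalar quantizer acting on the prediction
error, which in this sketch never exceeds half a step.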



Though decorrelation was an early motivation for predictive
quantization, the most common view at present is that
the primary role of the predictor is to reduce the variance
of the variable to be scalar quantized. This view stems
from the facts that (a) it is the prediction errors rather
than the source samples that are quantized, (b) the overall
quantization error precisely equals that of the scalar
quantizer operating on the prediction errors, (c) the
operational distortion-rate function $\delta(R)$ for scalar
quantization is proportional to variance (more precisely, a
scaling of the random variable being quantized by a factor
$a$ results in a scaling of $\delta(R)$ by $a^2$), and (d) the
density of the prediction error is usually sufficiently
similar in form to that of the source that its operational
distortion-rate function is smaller than that of the original
source by, approximately, the ratio of the variance of the
source to that of the prediction error, a quantity that is
often called a prediction gain [350], [396], [482], [397],
[265]. Analyses of this form usually claim that under
high-resolution conditions the distribution of the prediction
error approaches that of the error when predictions are based
on past source samples rather than past reproductions.
However, it is not clear that the accuracy of this
approximation increases sufficiently rapidly with finer
resolution to ensure that the difference between the
operational distortion-rate functions of the two types of
prediction errors is small relative to their values, which
are themselves decreasing as the resolution becomes finer.
Indeed, it is still an open question whether this type of
analysis, which typically uses Bennett and Panter-Dite
formulas, is asymptotically correct. Nevertheless, the results
of such high-resolution approximations are widely accepted
and often compare well with experimental results [265],
[156]. Assuming that they give the correct answer, then for
large rates and a stationary, Gaussian source with memory,
the distortion of an optimized DPCM quantizer is less than
that of a scalar quantizer by the factor
$\sigma_1^2/\sigma^2$, where $\sigma^2$ is the variance of the
source and $\sigma_1^2$ is the one-step prediction error;
i.e., the smallest MSE of any prediction of one sample based
on previous samples. It turns out that this exceeds
$\overline{\delta}(R)$ by the same factor by which the
distortion of optimal fixed-rate scalar quantization exceeds
$\overline{\delta}(R)$ for a memoryless Gaussian source.
Hence, it appears that DPCM does a good job of exploiting
source memory given that it is based on scalar quantization,
at least under the high-resolution assumption.

Because it has not been rigorously shown that one may
apply Bennett's integral or the Panter-Dite formula
directly to the prediction error, the analysis of such feedback
quantization systems has proved to be notoriously
difficult, with results limited to proofs of stability [191],
[281], [284], i.e., asymptotic stationarity, to analyses of
distortion via Hermite polynomial expansions for Gaussian
processes [124], [473], [17], [346], [241], [262], [156],
[189], [190], [367], [368], [369], [293], to analyses of
distortion when the source is a Wiener process [163], [346],
[240], and to exact solutions of the nonlinear difference
equations describing the system and hence to descriptions of
the output sequences and their moments, including power
spectral densities, for constant and sinusoidal signals and
finite sums of sinusoids using Rice's method, results which
extend the work of Panter, Clavier and Grieg to quantizers
inside a feedback loop [260], [71], [215], [216], [72].
Conditions for use in code design resembling the Lloyd
optimality conditions have been studied for feedback
quantization [161], [203], [41], but the conditions are not
optimality conditions in the Lloyd sense, i.e., they are not
necessary conditions for a quantizer within a feedback loop
to yield the minimum average distortion subject to a rate
constraint. We will return to this issue when we consider
finite-state vector quantizers. There has also been work on
the optimality of certain causal coding structures somewhat
akin to predictive or feedback quantization [331], [414],
[148], [534], [178], [381], [521].

Transform coding is the second approach to exploiting
redundancy by using scalar quantization with linear
preprocessing. Here, the source samples are collected into a
vector of, say, dimension $k$ that is multiplied by an
orthogonal matrix (an orthogonal transform) and the resulting
transform coefficients are scalar quantized, usually with a
different quantizer for each coefficient. The operation is
depicted in Figure 4. This style of code was introduced in
1956 by Kramer and Mathews [299] and analyzed and
popularized in 1962-3 by Huang and Schultheiss [247], [248].
Kramer and Mathews simply assumed that the goal of the
transform was to decorrelate the symbols, but Huang and
Schultheiss proved that decorrelating does indeed lead to
optimal transform code design, at least in the case of
Gaussian sources and high resolution. Transform coding has
been extensively developed for coding images and video,
where the discrete cosine transform (DCT) [7], [429] is
most commonly used because of its computational simplicity
and its good performance. Indeed DCT coding is the
basic approach dominating current image and video coding
standards, including H.261, H.263, JPEG, and MPEG.
These codes combine uniform scalar quantization of the
transform coefficients with an efficient lossless coding of the
quantizer indices, as will be considered in the next section
as a variable-rate quantizer. For discussions of transform
coding for images see [533], [422], [375], [265], [98], [374],
[261], [424], [196], [208], [408], [50], [458]. More recently
transform coding has also been widely used in high fidelity
audio coding [272], [200].

Unlike predictive quantizers, the transform coding
approach lent itself quite well to the Bennett high resolution
approximations, the classical analysis being that of
Huang and Schultheiss [247], [248] of the performance of
optimized transform codes for fixed-rate scalar quantizers
for Gaussian sources, a result which demonstrated that the
Karhunen-Loeve decorrelating transform was optimum for
this application for the given assumptions. If the transform
is the Karhunen-Loeve transform, then the coefficients will
be uncorrelated (and hence independent if the input vector
is also Gaussian). The seminal work of Huang and
Schultheiss showed that high resolution approximation theory
could provide analytical descriptions of optimal performance
and design algorithms for optimizing codes of a
given structure. In particular they showed that under the
high-resolution assumptions with Gaussian sources, the
average distortion of the best transform code with a given
rate is less than that of optimal scalar quantization by the
factor $(\det K)^{1/k}/\sigma^2$, where $\sigma^2$ is the
average of the variances of the components of the source
vector and $K$ is its $k \times k$ covariance matrix. Note
that this reduction in distortion becomes larger for sources
with more memory (more correlation) because the covariance
matrices of such sources have smaller determinants. When $k$
is large, it turns out that the distortion of optimized
transform coding with a given rate exceeds
$\overline{\delta}(R)$ by the same factor by which the
distortion of optimal fixed-rate scalar quantization exceeds
$\overline{\delta}(R)$ for a memoryless Gaussian source.
Hence, like DPCM, transform coding does a good job of
exploiting source memory given that it is a system based on
scalar quantization.
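
As a worked illustration of the determinant factor (an
example added here, not taken from the analyses cited above),
assume a stationary first-order Gauss-Markov source with
variance $\sigma^2$ and adjacent-sample correlation
coefficient $\rho$; both symbols are assumptions of this
example. Its covariance matrix and the resulting factor are

```latex
K_{ij} = \sigma^2 \rho^{|i-j|}, \qquad
\det K = \sigma^{2k} \, (1-\rho^2)^{k-1}, \qquad
\frac{(\det K)^{1/k}}{\sigma^2} = (1-\rho^2)^{(k-1)/k}
\;\longrightarrow\; 1-\rho^2 \quad (k \to \infty).
```

For $\rho = 0.9$ and $k = 8$ the factor is
$0.19^{7/8} \approx 0.234$ (about 6.3 dB), and it tends to
$0.19$ (about 7.2 dB) as $k$ grows. The limit $1-\rho^2$ is
the ratio of the one-step prediction error variance
$\sigma^2(1-\rho^2)$ of this source to its variance, so for
this source the large-$k$ transform coding factor coincides
with the high-resolution DPCM factor.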

C. Variable-Rate Quantization

Shannon's lossless source coding theory (1948) [464]
made it clear that assigning equal numbers of bits to all
quantization cells is wasteful if the cells have unequal
probabilities. Instead, the number of bits produced by the
quantizer will, on the average, be reduced if shorter
binary codewords are assigned to higher probability cells. Of
course this means that longer codewords will need to be
assigned to the less probable cells, but Shannon's theory
shows that, in general, there is a net gain. This leads
directly to variable-rate quantization, which has the
partition into cells and codebook of levels as before, but now
has binary codewords of varying lengths assigned to the
cells (alternatively the levels). Ordinarily, the set of binary
codewords is chosen to satisfy the prefix condition that no
member is a prefix of another member, in order to ensure
unique decodability. As will be made precise in the next
section, one may view a variable-rate quantizer as
consisting of a partition, a codebook and a lossless binary
code, i.e., an assignment of binary codewords.

For variable-rate quantizers the rate is no longer defined
as the logarithm of the codebook size. Rather, the
instantaneous rate for a given input is the number of binary
symbols in the binary codeword (the length of the binary
codeword) and the rate is the average length of the binary
codewords, where the average is taken over the probability
distribution of the source samples. The operational
distortion-rate function $\delta(R)$ using this definition is
the smallest average distortion over all (variable-rate)
quantizers having rate $R$ or less. Since we have weakened
the constraint by expanding the allowed set of quantizers,
this operational distortion-rate function will ordinarily be
smaller than the fixed-rate optimum.

Huffman's algorithm [251] provides a systematic method
of designing binary codes with the smallest possible average
length for a given set of probabilities, such as those of the
cells. Codes designed in this way are typically called
Huffman codes. Unfortunately, there is no known expression
for the resulting minimum average length in terms of the
probabilities. However, Shannon's lossless source coding
theorem implies that given a source and a quantizer
partition, one can always find an assignment of binary
codewords (indeed a prefix set) with average length not more
than $H(q(X)) + 1$, and that no uniquely decodable set of
binary codewords can have average length less than
$H(q(X))$, where

$$ H(q(X)) = -\sum_i P_i \log P_i $$

is the Shannon entropy of the quantizer output and
$P_i = \Pr(X \in S_i)$ is the probability that the source
sample $X$ lies in the $i$th cell $S_i$. Shannon also provided
a simple way of attaining performance within the upper bound:
if the quantizer index is $i$, then assign it a binary
codeword with length $\lceil -\log P_i \rceil$ (the Kraft
inequality ensures that this is always possible by simply
choosing paths in a binary tree). Moreover, tighter bounds
have been developed. For example Gallager [181] has shown
that the entropy can be at most $P_{\max} + 0.0861$ smaller
than the average length of the Huffman code, when
$P_{\max}$, the largest of the $P_i$'s, is less than 1/2.
See [73] for discussion of this and other bounds. Since
$P_{\max}$ is ordinarily much smaller than 1/2, this shows
that $H(q(X))$ is generally a fairly accurate estimate of the
average rate, especially in the high-resolution case.
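
The relationship between entropy, Huffman codeword lengths
and the lengths $\lceil -\log_2 P_i \rceil$ can be checked
numerically. The following Python sketch is illustrative
only: the function names and the example cell probabilities
are assumptions made here, and the Huffman construction is
the standard textbook merge of the two least probable nodes
using the standard-library `heapq` module.

```python
import heapq
import math


def entropy_bits(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (probability, tie-breaker, symbol indices in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1  # each symbol under the merged node gains one bit
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


def shannon_lengths(probs):
    """Lengths ceil(-log2 P_i); requires all probabilities to be positive."""
    return [math.ceil(-math.log2(p)) for p in probs]


def average_length(probs, lengths):
    return sum(p * n for p, n in zip(probs, lengths))


def kraft_sum(lengths):
    """Sum of 2^(-length); a prefix code with these lengths exists iff <= 1."""
    return sum(2.0 ** -n for n in lengths)


if __name__ == "__main__":
    cell_probs = [0.4, 0.3, 0.2, 0.1]  # hypothetical cell probabilities
    h = entropy_bits(cell_probs)
    huff = huffman_lengths(cell_probs)
    shan = shannon_lengths(cell_probs)
    print("entropy           :", h)
    print("Huffman lengths   :", huff, average_length(cell_probs, huff))
    print("Shannon lengths   :", shan, average_length(cell_probs, shan))
    print("Kraft sum (Shannon):", kraft_sum(shan))
```

For any probability vector the printed values should satisfy
entropy <= Huffman average length <= Shannon average length
< entropy + 1, which is the ordering stated in the text.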

Since there is no simple formula determining the rate of
the Huffman code, but entropy provides a useful estimate,
it is reasonable to simplify the variable-length quantizer
design problem a little by redefining the instantaneous rate
of a variable-rate quantizer as $-\log P_i$ for the $i$th
quantizer level and hence to define the average rate as
$H(q(X))$, the entropy of its output. As mentioned above,
this underestimates the true rate by a small amount that in
no case exceeds one. We could again define an operational
distortion-rate function as the minimum average distortion
over all variable-rate quantizers with output entropy
$H(q(X)) \le R$. Since the quantizer output entropy is a
lower bound to actual rate, this operational distortion-rate
function may be optimistic; i.e., it falls below $\delta(R)$
defined using average length as rate. A quantizer designed to
provide the smallest average distortion subject to an entropy
constraint is called an entropy-constrained scalar quantizer.

Variable-rate quantization is also called variable-length
quantization or quantization with entropy coding. We
will not, except where critical, take pains to distinguish
entropy-constrained quantizers and entropy-coded quantizers.
And we will usually blur the distinction between average
length and entropy as measures of the rate of such
quantizers unless, again, it is important in some particular
discussion. This is much the same sort of blurring as using
$\log N$ instead of $\lceil \log N \rceil$ as the measure of
rate in fixed-rate quantization.

It is important to note that the number of quantization
cells or levels does not play a primary role in variable-rate
quantization because, for example, there can be many levels
in places where the source density is small with little effect
on either distortion or rate. Indeed the number of levels
can be infinite, which has the advantage of eliminating the
overload region and resulting overload distortion.

A potential drawback of variable-rate quantization is the
necessity of dealing with the variable numbers of bits that
it produces. For example, if the bits are to be communicated
through a fixed-rate digital channel, one will have to
use buffering and to take buffer overflows and underflows
into account. Another drawback is the potential for error
propagation when bits are received by the decoder in error.

The most basic and simple example of a variable-rate 
quantizer, and one which plays a fundamental role as a 
benchmark for comparison, is a uniform scalar quantizer 
with a variable-length binary lossless code. 

The possibility of applying variable-length coding to
quantization may well have occurred to any number of people
who were familiar with both quantization and Shannon's
1948 paper. The earliest references to such that we
have found are in the 1952 papers by Oliver [395] and
Kretzmer [300]. In 1960, Max [349] had such in mind when he
computed the entropy of nonuniform and uniform quantizers
that had been designed to minimize distortion for a
given number of levels. For a Gaussian source his results
show that variable-length coding would yield rate reductions
of about 0.5 bits/sample.

High-resolution analysis of variable-rate quantization
developed in a handful of papers from 1958 to 1968. However,
since these papers were widely scattered or unpublished, it
was not until 1968 that the situation was well understood
in the IEEE community.

The first high-resolution analysis was that of
Schutzenberger (1958) [462] who showed that the distortion of
optimized variable-rate quantization (both scalar and vector)
decreases with rate as $2^{-2R}$, just as with fixed-rate
quantization. But he did not find the multiplicative factors,
nor did he describe the nature of the partitions and codebooks
that are best for variable-rate quantization.

In 1959, Renyi [433] showed that a uniform scalar quantizer
with infinitely many levels and small cell width $\Delta$ has
output entropy given approximately by

$$ H(q(X)) \approx h(X) - \log \Delta \tag{11} $$

where

$$ h(X) = -\int f(x) \log f(x)\,dx $$

is the differential entropy of the source variable $X$.

In 1963, Koshelev [579] discovered the very interesting
fact that in the high-resolution case, the mean-squared error
of uniform scalar quantization exceeds that of the least
distortion achievable by any quantization scheme whatsoever,
i.e., $\overline{\delta}(R)$, by a factor of only
$\pi e/6 \approx 1.42$. Equivalently, the induced
signal-to-noise ratio is only 1.53 dB less than the best
possible, or for a fixed distortion $D$, the rate is only
0.255 bit/sample larger than that achievable by the best
quantizers. (For the Gaussian source, it gains 2.82 dB or
0.47 bit/sample over the best fixed-rate scalar quantizer.)
It is also of interest to note that this was the first paper
to compare the performance of a specific quantization scheme
to $\overline{\delta}(R)$. Unfortunately, Koshelev's paper
was published in a journal that was not widely circulated.
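
Both the entropy approximation (11) and the 1.53 dB figure
can be checked numerically for a Gaussian source. The Python
sketch below is an illustration under stated assumptions, not
a reproduction of either author's analysis: it assumes a
unit-variance Gaussian, a uniform quantizer with levels at
integer multiples of the step, base-2 logarithms, and the
standard Gaussian distortion-rate formula
$\sigma^2 2^{-2R}$ as the benchmark. The function names and
the numerical integration parameters (`span`, `sub`) are
introduced here for illustration.

```python
import math


def gaussian_pdf(x, sigma):
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def gaussian_cdf(x, sigma):
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def uniform_quantizer_stats(step, sigma=1.0, span=10.0, sub=50):
    """Output entropy (bits) and MSE of a uniform quantizer on N(0, sigma^2).

    Levels sit at integer multiples of step. Cells farther than span*sigma
    from the origin are ignored because their probability is negligible.
    The MSE integral uses a midpoint rule with sub points per cell.
    """
    n_cells = int(math.ceil(span * sigma / step))
    entropy = 0.0
    mse = 0.0
    width = step / sub
    for i in range(-n_cells, n_cells + 1):
        level = i * step
        lo = level - step / 2.0
        p = gaussian_cdf(lo + step, sigma) - gaussian_cdf(lo, sigma)
        if p > 0.0:
            entropy -= p * math.log2(p)
        for j in range(sub):
            x = lo + (j + 0.5) * width
            mse += (x - level) ** 2 * gaussian_pdf(x, sigma) * width
    return entropy, mse


if __name__ == "__main__":
    sigma, step = 1.0, 0.1
    H, D = uniform_quantizer_stats(step, sigma)
    h = 0.5 * math.log2(2.0 * math.pi * math.e * sigma ** 2)  # Gaussian h(X)
    print("output entropy H(q(X))   :", H)
    print("h(X) - log2(step)        :", h - math.log2(step))
    benchmark = sigma ** 2 * 2.0 ** (-2.0 * H)  # Gaussian D(R) at rate H
    print("gap to benchmark in dB   :", 10.0 * math.log10(D / benchmark))
    print("10*log10(pi*e/6)         :", 10.0 * math.log10(math.pi * math.e / 6.0))
```

With a step of one tenth of a standard deviation the two
entropy values should agree closely and the measured gap
should be near the high-resolution value of about 1.53 dB.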

In an unpublished 1966 Bell Telephone Laboratories
Technical Memo [562], Zador also studied variable-rate (as
well as fixed-rate) quantization. As his focus was on vector
quantization, his work will be described later. Here we
only point out that for variable-rate scalar quantization
with large rate, his results showed that the operational
distortion-rate function (i.e., the least distortion of such
codes with a given rate) is

$$ \delta(R) \approx \frac{1}{12}\, 2^{2h(X)}\, 2^{-2R} \tag{12} $$

Though he was not aware of it, this turns out to be the
formula found by Koshelev, thereby demonstrating that in
the high-resolution case, uniform is the best type of scalar
quantizer when variable-rate coding is applied.

Finally, in 1967 and 1968 two papers appeared in the
IEEE literature (in fact in these Transactions) on
variable-rate quantization, without reference to any of the
aforementioned work. The first, by Goblick and Holsinger
[205], showed by numerical evaluation that uniform scalar
quantization with variable-rate coding attains performance
within about 1.5 dB (or 0.25 bit/sample) of the best
possible for an i.i.d. Gaussian source. The second, by Gish
and Pierce [204], demonstrated analytically what the first
paper had found empirically. Specifically, it derived (11)
and, more generally, the fact that a high resolution
nonuniform scalar quantizer has output entropy

$$ H(q(X)) \approx h(X) + \int f(x) \log \Lambda(x)\,dx \tag{13} $$

where $\Lambda(x)$ is the unnormalized point density of the
quantizer. They then used these approximations along with
Bennett's integral to rederive (12) and to show that in the
high-resolution case, uniform scalar quantizers achieve the
operational distortion-rate function of variable-rate
quantization. Next, by comparing to what is called the Shannon
lower bound to $\overline{\delta}(R)$, they showed that for
i.i.d. sources, the latter is only 1.53 dB (0.255
bits/sample) from the best possible performance
$\overline{\delta}(R)$ of any quantization system whatsoever,
which is what Koshelev [579] found earlier. Their results
showed that such good performance was attainable for any
source distribution, not just the Gaussian case checked by
Goblick and Holsinger. They also generalized the results from
squared-error distortion to nondecreasing functions of
magnitude error.
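
For the uniform case, the steps connecting (11) to (12) and
to the 1.53 dB figure can be written out as follows. This is
a sketch of the standard high-resolution argument, assuming
base-2 logarithms, squared-error distortion, the
$\Delta^2/12$ approximation for the distortion of a uniform
quantizer with small cell width $\Delta$, and the usual form
of the Shannon lower bound (written here as
$D_{\mathrm{SLB}}(R)$, a symbol introduced for this example).

```latex
D \approx \frac{\Delta^2}{12}, \qquad
R = H(q(X)) \approx h(X) - \log_2 \Delta
\;\Longrightarrow\; \Delta \approx 2^{\,h(X) - R}
\;\Longrightarrow\;
D \approx \frac{1}{12}\, 2^{2h(X)}\, 2^{-2R},

D_{\mathrm{SLB}}(R) = \frac{1}{2\pi e}\, 2^{2h(X)}\, 2^{-2R}
\;\Longrightarrow\;
\frac{D}{D_{\mathrm{SLB}}(R)} \approx \frac{2\pi e}{12}
 = \frac{\pi e}{6} \approx 1.42 .
```

In decibels the ratio is $10 \log_{10}(\pi e/6) \approx 1.53$
dB, and at fixed distortion it corresponds to a rate
difference of $\tfrac{1}{2}\log_2(\pi e/6) \approx 0.255$
bit/sample.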



Less well known is their proof of the fact that in the
high resolution case, the entropy of $k$ successive outputs
of a uniformly scalar quantized stationary source, e.g., with
memory, is

$$ H(q(X_1), \ldots, q(X_k)) \approx h(X_1, \ldots, X_k) - k \log \Delta \tag{14} $$

They used this, and the generalization of (13) to vectors, to
show that when rate and $k$ are large, uniform scalar
quantization with variable-length coding of $k$ successive
quantizer outputs (block entropy coding) achieves performance
that is 1.53 dB (0.255 bits/sample) from
$\overline{\delta}(R)$, even for sources with memory. (They
accomplished this by comparing to Shannon lower bounds.) This
important result was not widely appreciated until
rediscovered by Ziv (1985) [578], who also showed that a
similar result holds for small rates. Note that although
uniform scalar quantizers are quite simple, the lossless code
capable of approaching the $k$th-order entropy of the
quantized source can be quite complicated. In addition, Gish
and Pierce observed that when coding vectors, performance
could be improved by using quantizer cells other than the
cube implicitly used by uniform scalar quantizers and noted
that the hexagonal cell was superior in two dimensions, as
originally demonstrated by Fejes Toth [159] and Newman [385].

Though uniform quantization is asymptotically best for
entropy-constrained quantization, at lower rates nonuniform
quantization can do better, and a series of papers explored
algorithms for designing them. In 1969 Wood [539]
provided a numerical descent algorithm for designing an
entropy-constrained scalar quantizer, and showed, as
predicted by Gish and Pierce, that the performance was only
slightly superior to a uniform scalar quantizer followed by
a lossless code.

In a 1972 paper dealing with a vector quantization
technique to be discussed later, Berger [47] described
Lloyd-like conditions for optimality of an entropy-constrained
scalar quantizer for squared-error distortion. He formulated
the optimization as an unconstrained Lagrangian minimization
and developed an iterative algorithm for the design of
entropy-constrained scalar quantizers. He showed that Gish
and Pierce's demonstration of approximate optimality of
uniform scalar quantization for variable-rate quantization
holds approximately even when the rate is not large and
holds exactly for exponential densities, provided the levels
are placed at the centroids. In 1976 Netravali and Saigal
introduced a fixed-point algorithm with the same goal of
minimizing average distortion for a scalar quantizer with an
entropy constraint [376]. Yet another approach was taken
by Noll and Zelinski (1978) [391]. Berger refined his
approach to entropy-constrained quantizer design in [48].
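
The idea of an unconstrained Lagrangian minimization can be
illustrated with a small sample-based iteration. The sketch
below is not the algorithm of any of the papers cited above;
it is a generic Lloyd-style descent, written under the
assumptions that the quantizer is designed from training
samples rather than a known density, that rate is measured
by empirical output entropy in bits, and that the multiplier
`lam` is chosen by the user. The function name `ecsq_design`
and all parameter names are hypothetical.

```python
import math
import random


def ecsq_design(samples, levels, lam, iterations=20):
    """Lloyd-style descent on the cost D + lam * H for a scalar quantizer.

    D is the mean squared error on the samples and H is the empirical
    output entropy in bits. Returns (levels, probs, cost_history).
    """
    levels = list(levels)
    probs = [1.0 / len(levels)] * len(levels)
    history = []
    for _ in range(iterations):
        # Step 1: assign each sample to the level with the smallest
        # squared error plus lam times the ideal codeword length.
        lengths = [-math.log2(p) for p in probs]
        cells = [[] for _ in levels]
        for x in samples:
            best = min(range(len(levels)),
                       key=lambda i: (x - levels[i]) ** 2 + lam * lengths[i])
            cells[best].append(x)
        # Step 2: drop empty cells, move levels to cell centroids and
        # re-estimate the cell probabilities.
        cells = [c for c in cells if c]
        levels = [sum(c) / len(c) for c in cells]
        probs = [len(c) / len(samples) for c in cells]
        # Step 3: record the Lagrangian cost of the current quantizer.
        dist = sum((x - y) ** 2
                   for c, y in zip(cells, levels) for x in c) / len(samples)
        rate = -sum(p * math.log2(p) for p in probs)
        history.append(dist + lam * rate)
    return levels, probs, history


if __name__ == "__main__":
    random.seed(0)
    data = [random.gauss(0.0, 1.0) for _ in range(2000)]
    start = [-3.0 + 0.5 * i for i in range(13)]  # uniform initial levels
    final_levels, final_probs, costs = ecsq_design(data, start, lam=0.1)
    print("levels kept:", len(final_levels))
    print("first and last cost:", costs[0], costs[-1])
```

Each of the three steps cannot increase the cost, so the
recorded history is non-increasing; setting `lam` to zero
reduces the iteration to an ordinary sample-based Lloyd
iteration for squared error.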

Variable-rate quantization was also extended to DPCM
and transform coding, where high resolution analysis shows
that it gains the same relative to fixed-rate quantization as
it does when applied to direct scalar quantizing [398], [154].
We note, however, that the variable-rate quantization
analysis for DPCM suffers from the same flaws as the fixed-rate
quantization analysis for DPCM.

Numerous extensions of the Bennett-style asymptotic
approximations and the approximation of $r(D)$ or
$\delta(R)$ and the characterizations of properties of
optimal high resolution quantization for both fixed- and
variable-rate quantization for squared error and other error
moments appeared during the 1960's, e.g., [497], [498], [55],
[467], [8]. An excellent summary of the early work is
contained in a 1970 paper by Elias [143].

We close this section with an important practical
observation. The current JPEG and related standards can be
viewed as a combination of transform coding and
variable-length quantization. It is worth pointing out how the
standard resembles and differs from the models considered thus
far. As previously stated, the transform coefficients are
separately quantized by possibly different uniform
quantizers, the bin lengths of the quantizers being determined
by a customizable quantization table. This typically
produces a quantized transformed image with many zeros. The
lossless, variable-length code then scans the image in a
zig-zag (or Peano) fashion, producing a sequence of runlengths
of the zeros and indices corresponding to nonzero values,
which are then Huffman coded (or arithmetic coded). This
procedure has the effect of coding only the transform
coefficients with the largest magnitude, which are the ones
most important for reconstruction. The early transform coders
typically coded the first, say, $K$ coefficients, and ignored
the rest. In essence, the method adopted for the standards
selectively coded the most important coefficients, i.e., those
having the largest magnitude, rather than simply the lowest
frequency coefficients. The runlength coding step can
in hindsight be viewed as a simple way of locating the most
significant coefficients, which in turn are described the most
accurately. This implicit "significance" map was an early
version of an idea that would later be essential to wavelet
coders.
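
The zig-zag scan and runlength step can be sketched as
follows. This Python example is a simplified illustration
and not the entropy coder of the JPEG standard: it omits the
separate treatment of the lowest-frequency coefficient, any
limits on run length, and the final Huffman or arithmetic
coding stage. The function names, the `"EOB"` end-of-block
marker and the small example block are assumptions made
here.

```python
def zigzag_order(n=8):
    """(row, col) pairs of an n x n block listed in zig-zag scan order.

    Anti-diagonals are visited in order of row + col; the direction of
    travel alternates from one anti-diagonal to the next.
    """
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def run_length_pairs(block):
    """Scan a square block of quantizer indices in zig-zag order.

    Returns (zero_run, value) pairs, one per nonzero coefficient, where
    zero_run counts the zeros skipped since the previous nonzero value.
    Trailing zeros are summarized by a single "EOB" marker.
    """
    pairs = []
    run = 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs


if __name__ == "__main__":
    quantized = [[12, 3, 0, 0],
                 [-2, 0, 0, 0],
                 [0, 0, 0, 0],
                 [1, 0, 0, 0]]
    print(run_length_pairs(quantized))
```

The pairs locate each nonzero coefficient by the number of
zeros preceding it in the scan, which is the sense in which
the runlengths act as an implicit significance map.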

D. The Beginnings of Vector Quantization 

As described in the three previous subsections, the 1940's
through the early 1970's produced a steady stream of
advances in the design and analysis of practical quantization
techniques, principally scalar, predictive, transform and
variable-rate quantization, with quantizer performance
improving as these decades progressed. On the other hand, at
roughly the same time there was a parallel series of
developments that were more concerned with the fundamental
limits of quantization than with practical quantization
issues. We speak primarily of the remarkable work of Shannon
and the very important work of Zador, though there
were other important contributors as well. This work dealt
with what is now called vector quantization (VQ) (or block
or multidimensional quantization), which is just like scalar
quantization except that all components of a vector, of say
$k$ successive source samples, are quantized simultaneously.
As such they are characterized by a $k$-dimensional
partition, a $k$-dimensional codebook (containing
$k$-dimensional points, reproduction codewords or
codevectors), and an assignment of binary codewords to the
cells of the partition (equivalently, to the codevectors).

An immediate advantage of vector quantization is that it
provides a model of a general quantization scheme operating
on vectors without any structural constraints. It clearly
includes transform coding as a special case and can also be
considered to include predictive quantization operating
locally within the vector. This lack of structural constraints
makes the general model more amenable to analysis and
optimization. In these early decades vector quantization
served primarily as a paradigm for exploring fundamental
performance limits; it was not yet evident whether it would
become a practical coding technique.

Shannon's Source Coding Theory 

In his classic 1948 paper, Shannon [464] sketched the idea
of the rate of a source as the minimum bit rate required
to reconstruct the source to some degree of accuracy as
measured by a fidelity criterion such as mean squared error.
The sketch was fully developed in his 1959 paper [465] for
i.i.d. sources, additive measures of distortion, and block
source codes, now called vector quantizers. In this later
paper Shannon showed that when coding at some rate R,
the least distortion achievable by vector quantizers of any
kind is equal to a function D(R), subsequently called the
Shannon distortion-rate function, that is determined by the
statistics of the source and the measure of distortion.²

To elaborate on Shannon's theory, we note that one can
immediately extend the quantizer notation of (1), the
distortion and rate definitions of (2)-(3), and the operational
distortion-rate functions to define the smallest distortion
$\delta_k(R)$ possible for a k-dimensional fixed-rate vector
quantizer that achieves rate R or less. (The distortion between
two k-dimensional vectors is defined to be the numerical
average of the distortions between their respective
components. The rate is 1/k times the (average) number of bits
to describe a k-dimensional source vector.) We will make
the dimension k explicit in the notation when we are
allowing it to vary and omit it when not. Furthermore, as with
Shannon's channel coding and lossless source coding
theories, one can consider the best possible performance over
codes of all dimensions (assuming the data can be blocked
into vectors of arbitrary size) and define an operational
distortion-rate function

$$\delta(R) = \inf_k \delta_k(R). \qquad (15)$$

The operational rate-distortion functions $r_k(D)$ and $r(D)$
are defined similarly. For finite dimension k the function
$\delta_k(R)$ will depend on the definition of rate, i.e., whether
it is the log of the reproduction size, the average binary
codeword length, or the quantizer output entropy. It turns
out, however, that $\delta(R)$ is not affected by this choice. That
is, it is the same for all definitions of rate.

² Actually, Shannon described the solution to the equivalent
problem of minimizing rate subject to a distortion constraint and
found that the answer was given by a function R(D), subsequently
called the Shannon rate-distortion function, which is the inverse
of D(R). Accordingly, the theory is often called rate distortion
theory, cf. [46].

For an i.i.d. source $\{X_n\}$, the Shannon distortion-rate
function D(R) is defined as the minimum average distortion
E[d(X, Y)] over all conditional distributions of Y given X
for which the mutual information $I(X; Y)$ is at most R,
where we emphasize that X and Y are scalar variables here.
In his principal result, the coding theorem for source coding
with a fidelity criterion, Shannon showed that for every R,
$\delta(R) = D(R)$. That is, no VQ of any dimension k with
rate R could yield smaller average distortion than D(R),
and for some dimension (possibly very large) there
exists a VQ with rate no greater than R and distortion
very nearly D(R). As an illustrative example, the Shannon
distortion-rate function of an i.i.d. Gaussian source with
variance $\sigma^2$ is

$$D(R) = \sigma^2 2^{-2R}. \qquad (16)$$

Equivalently, the Shannon rate-distortion function is
$R(D) = \frac{1}{2} \log_2 \frac{\sigma^2}{D}$, $0 \le D \le \sigma^2$.
Since it is also known that this represents
the best possible performance of any quantization scheme
whatsoever, it is these formulas that we used previously
when comparing the performance of scalar quantizers to
that of the best quantization schemes. For example,
comparing (16) and (10), one sees why we made earlier the
statement that the operational distortion-rate function of
scalar quantization is larger than $\delta(R)$ by a constant
factor. Notice that (16) shows that for this source the
$2^{-2R}$ exponential rate of decay of distortion with rate,
demonstrated by high resolution arguments for high rates,
extends to all rates. This is not usually the case for other
sources.
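As a small numerical companion to the Gaussian example, the sketch below evaluates the distortion-rate function $D(R) = \sigma^2 2^{-2R}$ and its inverse. The function names are illustrative assumptions, and the rate is taken in bits per sample.

```python
import math


def gaussian_distortion_rate(rate_bits, variance):
    """Shannon D(R) for an i.i.d. Gaussian source with squared error."""
    return variance * 2.0 ** (-2.0 * rate_bits)


def gaussian_rate_distortion(distortion, variance):
    """Inverse function R(D), valid for 0 < D <= variance."""
    if not 0.0 < distortion <= variance:
        raise ValueError("distortion must lie in (0, variance]")
    return 0.5 * math.log2(variance / distortion)


# Each additional bit of rate divides the distortion by four,
# i.e. lowers it by 10*log10(4) dB (about 6.02 dB), at every rate.
db_per_bit = 10.0 * math.log10(
    gaussian_distortion_rate(1.0, 1.0) / gaussian_distortion_rate(2.0, 1.0)
)
```

The constant slope in dB per bit at all rates is the property the text singles out as special to the Gaussian source.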

Shannon's approach was subsequently generalized to
sources with memory, cf. [180], [45], [46], [218], [549], [127],
[126], [282], [283], [138], [479]. The general definitions of
distortion-rate and rate-distortion functions resemble those
for operational distortion-rate and rate-distortion functions
in that they are infima of kth-order functions. For
example, the kth-order distortion-rate function $D_k(R)$ of a
stationary random process $\{X_n\}$ is defined as an infimum
of the average distortion E[d(X, Y)] over all conditional
probability distributions of $Y = (Y_1, Y_2, \ldots, Y_k)$ given
$X = (X_1, X_2, \ldots, X_k)$ for which the average mutual
information $\frac{1}{k} I(X; Y) \le R$. The distortion-rate function
for the process is then given by $D(R) = \inf_k D_k(R)$. For
i.i.d. sources $D(R) = D_1(R)$, where $D_1(R)$ is what we
previously called D(R) for i.i.d. sources. (The rate-distortion
functions $R_k(D)$ and $R(D)$ are defined similarly.) A source
coding theorem then shows under appropriate conditions
that, for sources with memory, $\delta(R) = D(R)$ for all rates
R. In other words, Shannon's distortion-rate function
represents an asymptotically achievable, but never beatable,
lower bound to the performance of any VQ of any
dimension. The positive coding theorem demonstrating that the
Shannon distortion-rate function is in fact achievable if one
allows codes of arbitrarily large dimension and complexity
is difficult to prove, but the existence of good codes rests on
the law of large numbers, suggesting that large dimensions
might indeed be required for good codes, with consequently
large demands on complexity, memory, and delay.

Shannon's results, like those of Panter and Dite, Zador,
and Gish and Pierce, provide benchmarks for comparison
for quantizers. However, Shannon's results provide an
interesting contrast with these early results on quantizer
performance. Specifically, the early quantization theory had
derived the limits of scalar quantizer performance based on
the assumption of high resolution and showed that these
bounds were achievable by a suitable choice of quantizer.
Shannon, on the other hand, had fixed a finite,
nonasymptotic rate, but had considered asymptotic limits as the
dimension k of a vector quantizer was allowed to become
arbitrarily large. The former asymptotics, high resolution
for fixed dimension, are generally viewed as quantization
theory, while the latter, fixed-rate and high dimension, are
generally considered to be source coding theory or
information theory. Prior to 1960 quantization had been viewed
primarily as PCM, a form of analog-to-digital conversion or
digital modulation, while Shannon's source coding theory
was generally viewed as a mathematical approach to data
compression. The first to explicitly apply Shannon's source
coding theory to the problem of analog-to-digital
conversion combined with digital transmission appear to be
Goblick and Holsinger [205] in 1967, and the first to make
explicit comparisons of quantizer performance to Shannon's
rate-distortion function was Koshelev [579].

A distinct variation on the Shannon approach was
introduced to the English literature in 1956 by
Kolmogorov [288], who described several results by Russian
information theorists inspired by Shannon's 1948 treatment
of coding with respect to a fidelity criterion. Kolmogorov
considered two notions of the rate with respect to a fidelity
criterion: His second notion was the same as Shannon's,
where a mutual information was minimized subject to a
constraint on the average distortion, in this case measured
by squared error. The first performed a similar
minimization of mutual information, but with the requirement that
maximum distortion between the input and reproduction
did not exceed a specified level $\epsilon$. Kolmogorov referred to
both functions as the "$\epsilon$-entropy" $H_\epsilon(X)$ of a random
object X, but the name has subsequently been considered to
apply to the maximum distortion being constrained to be
less than $\epsilon$, rather than the Shannon function, later called
the rate-distortion function, which constrained the average
distortion. Note that the maximum distortion with respect
to a distortion measure d can be incorporated in the
average distortion formulation if one considers a new distortion
measure $\rho$ defined by

$$\rho(x, y) = \begin{cases} 0 & \text{if } d(x, y) \le \epsilon \\ \infty & \text{otherwise.} \end{cases} \qquad (17)$$



As with Shannon's rate-distortion function, this was an
information theoretic definition. As with quantization,
there are corresponding operational definitions. The
operational epsilon entropy ($\epsilon$-entropy) of a random variable
X can be defined as the smallest entropy of a quantized
output such that the reproduction is no further from the
input than $\epsilon$ (at least with probability 1):

$$\mathcal{H}_\epsilon(X) = \inf_{q:\, d(X, q(X)) \le \epsilon \text{ w.p. } 1} H(q(X)). \qquad (18)$$

This is effectively a variable-rate definition since lossless
coding would be required to achieve a bit rate near the
entropy. Alternatively one could define the operational
epsilon entropy as $\log N_\epsilon$, where $N_\epsilon$ is the smallest number
of reproduction codevectors for which all inputs are (with
probability 1) within $\epsilon$ of a codevector. This quantity is
clearly infinite if the random object X does not have finite
support. As in the Shannon case, all these definitions can
be made for k-dimensional vectors $X^k$ and the limiting
behavior can be studied. Results regarding the convergence
of such limits and the equality of the information-theoretic
and operational notions of epsilon entropy can be found,
e.g., in [421], [420], [278], [59]. Much of the theory is
concerned with approximating epsilon entropy for small $\epsilon$.

Epsilon entropy extends to function approximation
theory with a slight change by removing the notion of
probability. Here the epsilon entropy becomes the log of the
smallest number of balls of radius $\epsilon$ required to cover a
compact metric space (e.g., a function space) (see, e.g.,
[520], [420] for a discussion of various notions of epsilon
entropy).
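For intuition, the covering definition can be evaluated in closed form for the simplest compact metric space, an interval of the real line under absolute difference. The sketch below is only that toy case; the function names are assumptions, and the balls are taken to be closed, so each covers a length of $2\epsilon$.

```python
import math


def interval_covering_number(length, eps):
    """Smallest number of closed balls of radius eps that cover [0, length]."""
    # Each ball is an interval of length 2*eps, so ceil(length / (2*eps))
    # balls are necessary and, placed end to end, also sufficient.
    return max(1, math.ceil(length / (2.0 * eps)))


def interval_epsilon_entropy_bits(length, eps):
    """Epsilon entropy (log base 2 of the covering number) of [0, length]."""
    return math.log2(interval_covering_number(length, eps))


# Covering [0, 1] to within eps = 1/8 needs 4 balls, i.e. 2 bits.
n_balls = interval_covering_number(1.0, 0.125)
bits = interval_epsilon_entropy_bits(1.0, 0.125)
```

Halving $\epsilon$ adds one bit in this example, which is the scalar analogue of the small-$\epsilon$ approximations mentioned above.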

We mention epsilon entropy because of its close
mathematical connection to rate distortion theory. Our emphasis,
however, is on codes that minimize average, not maximum,
distortion.

The Earliest Vector Quantization Work 

Outside of Shannon's sketch of rate distortion theory in
1948, the earliest work with a definite vector quantization
flavor appeared in the mathematical and statistical
literature. Most important was the remarkable work of
Steinhaus in 1956 [480], who considered a problem equivalent
to a three-dimensional generalization of scalar
quantization with a squared-error distortion measure. Suppose that
a mass density $m(x)$ is defined on Euclidean space. For
any finite N, let $\mathcal{S} = \{S_i;\ i = 1, \ldots, N\}$ be a partition
of Euclidean space into N disjoint bodies (cells) and let
$\mathcal{C} = \{y_i;\ i = 1, \ldots, N\}$ be a collection of N vectors, one
associated with each cell of the partition. What partition
$\mathcal{S}$ and collection of vectors $\mathcal{C}$ minimizes

$$\sum_{i=1}^{N} \int_{S_i} m(x) \|x - y_i\|^2 \, dx,$$

the sum of the moments of inertia of the cells about the
associated vectors? This problem is formally equivalent
to a fixed-rate three-dimensional vector quantizer with a
squared-error distortion measure and a probability
density $m(x) / \int m(x') \, dx'$. Steinhaus derived what we now
consider to be the Lloyd optimality conditions (centroid
and nearest neighbor mapping) from fundamental
principles (without variational techniques), proved the existence
of a solution, and described the iterative descent algorithm
for finding a good partition and vector collection. His
derivation applies immediately to any finite-dimensional
space and hence, like Lloyd's, extends immediately to
vector quantization of any dimension. Steinhaus was aware
of the problems with local optima, but stated that
"generally" there would be a unique solution. No mention is
made of "quantization," but this appears to be the first
paper to both state the vector quantization problem and
to provide necessary conditions for a solution, which yield
a design algorithm.
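The iterative descent implied by the two optimality conditions can be sketched as below. This is a minimal illustration that works on a finite training set in place of the density $m(x)$; the function names, the synthetic Gaussian data, and the fixed iteration count are assumptions for the example, not details taken from Steinhaus or Lloyd.

```python
import random


def sq_dist(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def nearest(x, codebook):
    """Nearest neighbor condition: index of the closest codevector."""
    return min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))


def mse(training, codebook):
    """Average squared error of the training set under nearest-neighbor encoding."""
    total = sum(sq_dist(x, codebook[nearest(x, codebook)]) for x in training)
    return total / len(training)


def lloyd(training, codebook, iterations=20):
    """Alternate the nearest-neighbor partition and the centroid update."""
    codebook = [tuple(y) for y in codebook]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for x in training:
            cells[nearest(x, codebook)].append(x)
        # Centroid condition: replace each codevector by the mean of its cell.
        # An empty cell keeps its old codevector (one simple convention).
        codebook = [
            tuple(sum(col) / len(cell) for col in zip(*cell)) if cell else old
            for cell, old in zip(cells, codebook)
        ]
    return codebook


rng = random.Random(0)
training = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(500)]
initial = training[:4]                  # arbitrary starting codebook
final = lloyd(training, initial)
before, after = mse(training, initial), mse(training, final)
```

Each iteration cannot increase the training-set distortion, so the procedure descends to a local optimum; as the text notes, that optimum need not be global.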

In 1959 Fejes Tóth described the specific application
of Steinhaus' problem in two dimensions to a source
with a uniform density on a bounded support region and
to quantization with an asymptotically large number of
points [159]. Using an earlier inequality of his [158], he
showed that the optimal two-dimensional quantizer
under these assumptions tessellated the support region with
hexagons. This was the first evaluation of the performance
of a genuinely multidimensional quantizer. It was rederived
in a 1964 Bell Laboratories Technical Memorandum by
Newman [385], its first appearance in English. It made
a particularly important point: even in the simple case
of two independent uniform random variables, with no
redundancy to remove, the performance achievable by
quantizing vectors using a hexagonal lattice encoding partition
is strictly better than that achievable by uniform scalar
quantization, which can be viewed as a two-dimensional
quantizer with a square encoding lattice.
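The size of the hexagonal advantage can be checked numerically. The sketch below compares the normalized second moment (mean squared error per dimension at unit cell area) of a regular hexagonal cell with that of a square cell, under the high resolution assumption that the density is uniform over each cell. The midpoint-rule integration and the function name are assumptions made for the example.

```python
import math


def hexagon_mean_sq_radius(n=400):
    """Midpoint-rule estimate of E||x||^2 over a regular hexagon
    with unit circumradius, centred at the origin."""
    h = 2.0 / n
    s3 = math.sqrt(3.0)
    total, count = 0.0, 0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            # Inside test for the hexagon with vertices at (+-1, 0)
            # and (+-1/2, +-sqrt(3)/2).
            if abs(y) <= s3 / 2.0 and s3 * abs(x) + abs(y) <= s3:
                total += x * x + y * y
                count += 1
    return total / count


hex_area = 3.0 * math.sqrt(3.0) / 2.0
# Normalized second moment in two dimensions: E||x||^2 / (2 * area).
g_hex = hexagon_mean_sq_radius() / (2.0 * hex_area)
g_square = (1.0 / 6.0) / 2.0        # unit square: E||x||^2 = 1/6, area = 1
gain_db = 10.0 * math.log10(g_square / g_hex)
```

The estimate of `g_hex` should land near the closed-form value $5/(36\sqrt{3}) \approx 0.0802$, against $1/12 \approx 0.0833$ for the square, a gain of roughly 0.17 dB. The gain is small but strictly positive, which is the point of the example.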

The first high resolution approximations for vector
quantization were published by Schützenberger in 1958 [462],
who found upper and lower bounds to the least distortion of
k-dimensional variable-rate vector quantizers, both of the
form $K 2^{-2R}$. Unfortunately, the upper and lower bounds
diverge as k increases.

In 1963 Zador [561] made a very large advance by using
high-resolution methods to show that for large rates, the
operational distortion-rate function of fixed-rate
quantization has the form

$$\delta_k(R) \cong b_k \|f\|_{k/(k+2)} 2^{-2R}, \qquad (19)$$

where $b_k$ is a term that is independent of the source, $f(x)$
is the k-dimensional source density, and

$$\|f\|_{k/(k+2)} = \left( \int f^{k/(k+2)}(x) \, dx \right)^{(k+2)/k}$$

is the term that depends on the source. This generalized
the Panter-Dite formula to the vector case. While the
formula for $\delta_k(R)$ obviously matches the Shannon
distortion-rate function D(R) when both dimension and rate are
large (because in this case both are approximations to
$\delta_k(R) \approx \delta(R)$), Zador's formula has the advantage of being
applicable for any dimension k while the Shannon theory
is applicable only for large k. On the other hand,
Shannon theory is applicable for any rate R while high
resolution theory is applicable only for large rates. Thus,
the two theories are complementary. Zador also
explicitly extended Lloyd's optimality properties to vectors with
distortion measures that were integer powers of the
Euclidean norm, thereby also generalizing Steinhaus' results
to dimensions higher than three, but he did not specifically
consider descent design algorithms. Unfortunately, the
results of Zador's thesis were not published until 1982 [563]
and were little known outside of Bell Laboratories until
Gersho's important paper of 1979 [193], to be described
later.
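To make (19) concrete, the sketch below evaluates its scalar case (k = 1, the Panter-Dite formula) for a unit-variance Gaussian density by numerical integration. It assumes the known scalar constant $b_1 = 1/12$; the function names, the integration range, and the step count are choices made for the example.

```python
import math


def gaussian_pdf(x, variance=1.0):
    """Density of a zero-mean Gaussian with the given variance."""
    return math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def scalar_zador_norm(density, lo=-12.0, hi=12.0, steps=24000):
    """(integral of f^(1/3))^3 for a scalar density, by the midpoint rule."""
    h = (hi - lo) / steps
    integral = sum(density(lo + (i + 0.5) * h) ** (1.0 / 3.0) for i in range(steps)) * h
    return integral ** 3


B_1 = 1.0 / 12.0   # assumed value of the k = 1 constant (Panter-Dite)

# High resolution distortion is approximately constant * 2^(-2R).
fixed_rate_constant = B_1 * scalar_zador_norm(gaussian_pdf)

# The Shannon function (16) for unit variance is 1 * 2^(-2R), so this
# constant is also the high-rate ratio of scalar to ideal distortion.
loss_db = 10.0 * math.log10(fixed_rate_constant)
```

Under these assumptions the constant comes out close to $\sqrt{3}\pi/2 \approx 2.72$, so optimal fixed-rate scalar quantization of a Gaussian source sits about 4.35 dB above the Shannon bound at high rates.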

Zador's dissertation also dealt with the analysis of
variable-rate vector quantization, but the asymptotic
formula given there is not the correct one. Rather it was left to
his subsequent unpublished 1966 memo [562] to derive the
correct formula. (Curiously, his 1982 paper [563] reports
the formula from the thesis rather than the memo.) Again
using high-resolution methods, he showed that for large
rates, the operational distortion-rate function of
variable-rate vector quantization has the form

$$\delta_k(R) \cong c_k 2^{2 h_k} 2^{-2R}, \qquad (20)$$

where $c_k$ is a term that is independent of the source and
$h_k = -\frac{1}{k} \int f(x) \log_2 f(x) \, dx$ is the dimension-normalized
differential entropy of the source. This completed what he and
Schützenberger had begun.

In the mid-1960's the optimality properties described by
Steinhaus, Lloyd, and Zador and the design algorithm of
Steinhaus and Lloyd were rediscovered in the statistical
clustering literature. Similar algorithms were introduced in
1965 by Forgy [172], Ball and Hall [29], [230], Jancey [263],
and in 1969 by MacQueen [341] (the "k-means" algorithm).
These algorithms were developed for statistical clustering
applications, the selection of a finite collection of templates
that well represent a large collection of data in the MSE
sense, i.e., a fixed-rate VQ with an MSE distortion
measure in quantization terminology, cf. Anderberg [9],
Hartigan [238], or Diday and Simon [133]. MacQueen used an
incremental incorporation of successive samples of a
training set to design the codes, each vector being first mapped
into a minimum distortion reproduction level representing
a cluster, and then the level for that cluster being replaced
by an adjusted centroid. Forgy and Jancey used
simultaneous updates of all centroids, as did Steinhaus and Lloyd.
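The incremental update attributed to MacQueen can be sketched as below, in contrast to the batch (simultaneous) centroid update. This is an illustrative reading of the description above rather than a transcription of [341]: the function name, the choice of the first samples as initial levels, and the running-mean form of the "adjusted centroid" are assumptions.

```python
def incremental_kmeans(samples, num_levels):
    """Process samples one at a time, moving only the winning level."""
    # Assumption: the first num_levels samples seed the reproduction levels.
    codebook = [list(x) for x in samples[:num_levels]]
    counts = [1] * num_levels
    for x in samples[num_levels:]:
        # Map the sample to its minimum distortion reproduction level.
        i = min(
            range(num_levels),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, codebook[j])),
        )
        # Adjust that level so it remains the mean of all samples mapped to it.
        counts[i] += 1
        codebook[i] = [c + (a - c) / counts[i] for c, a in zip(codebook[i], x)]
    return codebook, counts


samples = [(0.0,), (10.0,), (1.0,), (9.0,), (2.0,)]
codebook, counts = incremental_kmeans(samples, 2)
```

In this toy run the two levels end at 1.0 and 9.5. Unlike the batch update, earlier samples are never reassigned when a level moves, which is the main practical difference between the two styles.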

Unfortunately many of these early results did not
propagate among the diverse groups working on similar
problems. Zador's extensions of Lloyd's results were little
known outside of Bell Laboratories. The work of Steinhaus
has been virtually unknown in the quantization community
until recently. The work in the clustering community on
what were effectively vector quantizer design algorithms
in the context of statistical clustering was little known at
the time in the quantization community, and it was not
generally appreciated that Lloyd's algorithm was in fact a
clustering algorithm. Part of the lack of interest through
the 1950's was likely due to the fact that there had not yet
appeared any strong motivation to consider the
quantization of vectors instead of scalars. This motivation came
as a result of Shannon's landmark 1959 paper on source
coding with a fidelity criterion.

E. Implementable Vector Quantizers

As mentioned before, it was not evident from the earliest
studies that vector quantization could be a practical
technique. The only obvious encoding procedure is brute force
nearest neighbor encoding: compare the source vector to
be quantized with all reproduction vectors in the codebook.
Since a (fixed-rate) VQ with dimension k and rate R has
$2^{kR}$ codevectors, the number of computations required to
do this grows exponentially with the dimension-rate
product kR, and gets quickly out of hand. For example, if
k = 10 and R = 2, there are roughly one million
codevectors. Moreover, these codevectors need to be stored, which
also consumes costly resources. Finally, the proof of
Shannon's source coding theorem relies on the dimension
becoming large, suggesting that large dimension might be needed
to attain good performance. As a point of reference, we
note that in the development of channel codes, for which
Shannon's theory had also suggested large dimension, it
was common circa 1970 to consider channel codes with
dimensions on the order of 100 or more. Thus, it no doubt
appeared to many that similarly large dimensions might
be needed for effective quantization. Clearly a brute force
implementation of VQ with such dimensions would be out
of the question. On the other hand, the channel codes of
this era with large dimension and good performance, e.g.
BCH codes, were highly structured so that encoding and
decoding need not be done by brute force.
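The complexity argument can be made concrete with a short sketch of brute force nearest neighbor encoding and of the codebook size $2^{kR}$. The function names and the tiny two-dimensional codebook are assumptions for the example.

```python
def codebook_size(dimension, rate):
    """Number of codevectors of a fixed-rate VQ: 2^(k*R)."""
    return round(2 ** (dimension * rate))


def full_search_encode(x, codebook):
    """Compare x with every codevector and return the index of the closest."""
    best_index, best_dist = -1, float("inf")
    for i, y in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(x, y))   # one distance per codevector
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index


# k = 2, R = 1: four codevectors, so four distance computations per input.
tiny_codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
index = full_search_encode((0.9, 0.1), tiny_codebook)

# k = 10, R = 2: the same search would visit about a million codevectors.
large = codebook_size(10, 2)
```

Both the search time and the storage scale with `codebook_size(k, R)`, which is why structured codebooks were the natural first step toward practical VQ.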

From the above discussion, it should not be surprising
that the first VQ intended as a practical technique had a
reproduction codebook that was highly structured in
order to reduce the complexity of encoding and decoding.
Specifically, we speak of the fixed-rate vector quantizer
introduced in 1965 by Dunn [137] for multidimensional i.i.d.
Gaussian vectors. He argued that his code was effectively a
permutation code as earlier used by Slepian [472] for
channel coding, in that the reproduction codebook contains only
codevectors that are permutations of each other. This leads
to a quantizer with reduced (but still fairly large)
complexity. Dunn compared numerical computations of the
performance of this scheme to the Shannon rate-distortion
function. As mentioned earlier, this was the first such
comparison. In 1972 Berger, Jelinek and Wolf [49], and Berger [47]
introduced lower complexity encoding algorithms for
permutation codes, and Berger [47] showed that for large
dimensions, the operational distortion-rate function of
permutation codes is approximately equal to that of optimal
variable-rate scalar quantizers. While they do not attain
performance beyond that of scalar quantization,
permutation codes have the advantage of avoiding the buffering and
error propagation problems of variable-rate quantization.
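One way to see why permutation codes admit low complexity encoding is the sorting argument sketched below: with squared error, the closest permutation of a base vector places its largest component where the source vector is largest, and so on down the ranks. This is a standard consequence of the rearrangement inequality and is offered as an illustration, not as the exact procedure of [49] or [47]. The function names and the base vector are assumptions.

```python
from itertools import permutations


def sq_dist(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def permutation_encode(x, base):
    """Nearest permutation of `base` to x, found by sorting instead of search."""
    order = sorted(range(len(x)), key=lambda i: x[i])   # positions of x, smallest first
    ranked = sorted(base)                               # base components, smallest first
    y = [0.0] * len(x)
    for position, value in zip(order, ranked):
        y[position] = value          # match ranks of the source and base components
    return tuple(y)


def brute_force_encode(x, base):
    """Reference encoder: search every distinct permutation of `base`."""
    return min(set(permutations(base)), key=lambda y: sq_dist(x, y))


x = (0.3, -1.2, 2.0, 0.1)
base = (-1, 0, 0, 1)
fast = permutation_encode(x, base)
slow = brute_force_encode(x, base)
```

The sorting encoder costs on the order of k log k operations, while the codebook it implicitly searches can contain up to k! codevectors.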

Notwithstanding the skepticism of some about the
feasibility of brute force, unstructured vector quantization,
serious studies of such began to appear in the mid-1970's,
when several independent results were reported describing
applications of clustering algorithms, usually k-means, to
problems of vector quantization. In 1974-1975 Chaffee [76]
and Chaffee and Omura [77] used clustering ideas to
design a vector quantizer for very low rate speech vocoding.
In 1977 Hilbert used clustering algorithms for joint image
compression and image classification [242]. These papers
appear to be the first applications of direct vector
quantization to speech and image coding. Also in
1977, Chen used an algorithm equivalent to a 2-dimensional
Lloyd algorithm to design 2-dimensional vector
quantizers [87].

In 1978 and 1979 a vector extension of Lloyd's Method
I was applied to linear predictive coded (LPC) speech
parameters by Buzo and others [220], [67], [68], [223] with a
weighted quadratic distortion measure on parameter
vectors closely related to the Itakura-Saito spectral
distortion measure [258], [259], [257]. Also in 1978, Adoul,
Collin, and Dalle [3] used clustering ideas to design
two-dimensional vector quantizers for speech coding. Caprio,
Westin, and Esposito in 1978 [74] and Menez, Boeri, and
Esteban in 1979 [353] also considered clustering algorithms
for the design of vector quantizers with squared-error and
magnitude-error distortion measures.

The most important paper on quantization during the
1970's was without a doubt Gersho's paper on
"Asymptotically optimal block quantization" [193]. The paper
popularized high resolution theory and the potential
performance gains of vector quantization, provided new,
simplified variations and proofs of Zador's results and vector
extensions of Gish and Pierce's results with squared-error
distortion, and introduced lattice vector quantization as a
means of achieving the asymptotically optimal quantizer
point density for entropy constrained vector quantization
for a random vector with bounded support. The simple
derivations combined the vector quantizer point density
approximations with the use of Hölder's and Jensen's
inequalities, generalizing a scalar quantizer technique introduced
in 1977 [222]. One step of the development rested on a still
unproved conjecture regarding the asymptotically optimal
quantizer cell shapes and Zador's constants, a conjecture
which since has borne Gersho's name and which will be
considered at some length in Section IV. Portions of this
work were extended to nondecreasing functions of norms
in [554].

Gersho's work stimulated renewed interest in the theory
and design of direct vector quantizers and demonstrated
that, contrary to the common impression that very large
dimensions were required, significant gains could be achieved
over scalar quantization by quantizing vectors of modest
dimension and, as a result, such codes might be competitive
with predictive and transform codes in some applications.

In 1980 Linde, Buzo, and Gray explicitly extended
Lloyd's algorithm to vector quantizer design [318]. As we
have seen, the clustering approach to vector quantizer
design originated years earlier, but the Linde et al. paper
introduced it as a direct extension to the original Lloyd
optimal PCM design algorithm, extended it to more general
distortion measures than had been previously considered
(including an input-weighted quadratic distortion useful in
speech coding), and succeeded in popularizing the
algorithm to the point that it is often referred to as the "LBG
algorithm." A "splitting" method for designing the
quantizer from scratch was developed, wherein one first designs
a quantizer with two words (2-means), then doubles the
codebook size by adding a new codevector near each
existing codevector, then runs Lloyd's algorithm again, and
so on. The numerical examples of quantizer design
complemented Gersho's high-resolution results much as Lloyd's
had complemented Panter and Dite: it was shown that even
with modest dimensions and modest rates, significant gains
over scalar quantization could be achieved by direct vector
quantization of modest complexity. Later in the same year,
Buzo et al. [69] developed a tree-structured vector
quantizer (TSVQ) for 10-dimensional LPC vectors that greatly
reduced the encoder complexity from exponential growth
with the dimension-rate product kR to linear growth by
searching a sequence of small codebooks instead of a single
large codebook. The
result was an 800 bits per second LPC speech coder with
intelligible quality comparable to that of scalar quantized
LPC speech coders of four times the rate. (See also [538].)
In the same year Adoul, Debray, and Dalle [4] also used a
spectral distance measure to optimize predictors for DPCM
and the first thorough study of vector quantization for
image compression was published by Yamada, Fujita, and
Tazaki [551].
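The splitting method described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published LBG procedure: it starts from the single centroid of the training set (so the first split yields the 2-word quantizer), uses a fixed additive perturbation `perturb`, runs a fixed number of Lloyd iterations, and all function names are hypothetical.

```python
import random


def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def nearest(x, codebook):
    return min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))


def centroid(vectors):
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def mse(training, codebook):
    return sum(sq_dist(x, codebook[nearest(x, codebook)]) for x in training) / len(training)


def lloyd(training, codebook, iterations=20):
    """Alternate nearest-neighbor partitioning and centroid updates."""
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for x in training:
            cells[nearest(x, codebook)].append(x)
        # Empty cells keep their old codevector.
        codebook = [centroid(c) if c else old for c, old in zip(cells, codebook)]
    return codebook


def design_by_splitting(training, target_size, perturb=1e-3):
    """Grow the codebook 1 -> 2 -> 4 -> ... with a Lloyd run after each split."""
    codebook = [centroid(training)]
    while len(codebook) < target_size:
        # Keep each codevector and add a slightly perturbed copy next to it.
        codebook = [
            tuple(c + delta for c in y) for y in codebook for delta in (0.0, perturb)
        ]
        codebook = lloyd(training, codebook)
    return codebook


rng = random.Random(1)
training = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(400)]
one_word = design_by_splitting(training, 1)
four_words = design_by_splitting(training, 4)
```

Because each split retains the existing codevectors, the training distortion cannot increase from one codebook size to the next, which makes the method a convenient way to obtain a reasonable initial codebook at every rate along the way.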

In hindsight, the surprising effectiveness of low
dimensional VQ, e.g. k = 2 to 10, can be explained by the
fact that in Shannon's theory large dimension is needed to
attain performance arbitrarily close to the ideal. In
channel coding at rates less than capacity, ideal performance
means zero error probability, and large dimension is needed
for codes to approach this. However when quantizing at
a given rate R, ideal performance means distortion equal
to $\delta(R)$. Since this is not zero, there is really no point
to making the difference between actual and ideal
performance arbitrarily small. For example, it might be enough
to come within 5 to 20% (0.2 to 0.8 dB) of $\delta(R)$, which does
not require terribly large dimension. We will return to this
in Section IV with estimates of the required dimension.
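The percentage and decibel figures quoted above are related by a one-line conversion, sketched here with an illustrative function name.

```python
import math


def excess_distortion_db(fraction_above_ideal):
    """Convert 'distortion exceeds the ideal by this fraction' to decibels."""
    return 10.0 * math.log10(1.0 + fraction_above_ideal)


low = excess_distortion_db(0.05)    # 5% above the ideal distortion
high = excess_distortion_db(0.20)   # 20% above the ideal distortion
```

Rounded to one decimal place, these give the 0.2 dB and 0.8 dB endpoints in the text.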

There followed an active period for all facets of
quantization theory and design. Many of these results developed
early in the decade were fortuitously grouped in the March
1982 special issue on Quantization of these Transactions,
which published the Bell Laboratories Technical Memos
of Lloyd, Newman and Zador along with Berger's
extension of the optimality properties of entropy-constrained
scalar quantization to rth power distortion measures and
his extensive comparison of minimum entropy quantizers
and fixed-rate permutation codes [48], generalizations by
Trushkin of Fleischer's conditions for uniqueness of local
optima [503], results on the asymptotic behavior of Lloyd's
algorithm with training sequence size based on the theory
of k-means consistency by Pollard [418], two seminal papers
on lattice quantization by Conway and Sloane [103], [104],
rigorous developments of the Bennett theory for vector
quantizers and rth power distortion measures by Bucklew
and Wise [64], Kieffer's demonstration of stochastic
stability for a general class of feedback quantizers including the
historic class of predictive quantizers and delta modulators
along with adaptive generalizations [281], Kieffer's study
of the convergence rate of Lloyd's algorithm [280], and the
demonstration by Garey, Johnson, and Witsenhausen that
the Lloyd-Max optimization was NP-hard [187].

Towards the middle of the 1980's, several tutorial articles 
on vector quantization appeared, which greatly increased 
the accessibility of the subject [195], [214], [342], [372]. 
F. The Mid 1980's to the Present

In the middle to late 1980's a wide variety of vector
quantizer design algorithms were developed and tested for
speech, images, video, and other signal sources. Some of
the quantizer design algorithms developed as alternatives
to Lloyd's algorithm include simulated annealing [140],
[507], [169], [289], deterministic annealing [445], [446],
[447], pairwise nearest neighbor [146] (which had its
origins in earlier clustering techniques [524]), stochastic
relaxation [567], [571], self organizing feature maps [290], [544],
[545] and other neural nets [495], [301], [492], [337], [65]. A
variety of quantization techniques were introduced by
constraining the structure of the vector quantization to better
balance complexity with performance and these methods
were applied to real signals (especially speech and images)
as well as to random sources, which permitted
comparison to the theoretical high-resolution and Shannon bounds.
The literature begins to grow too large to cite all works
of possible interest, but several of the techniques will be
considered in Section V. Here, we only mention several
examples with references and leave further discussion to
Section V.

As will be discussed in some depth in Section V, fast
search algorithms were developed for unstructured
reproduction codebooks, and even faster searches for
reproduction codebooks constrained to have a simple structure,
for example to be a subset of points of a regular
lattice as in a lattice vector quantizer. Additional structure
can be imposed for faster searches with virtually no loss
of performance, as in Fisher's pyramid VQ [164], which
takes advantage of the asymptotic equipartition property
to choose a structured support region for the quantizer.
Tree-structured VQ uses a tree-structured reproduction
codebook with a matched tree-structured search algorithm.
A tree-structured VQ with far less memory is provided by
a multistage or residual VQ. A variety of product vector
quantizers use a cartesian product reproduction codebook,
which often can be rapidly searched. Examples include
polar vector quantizers, mean-removed vector quantizers, and
shape-gain vector quantizers. Trellis encoders and
trellis-coded quantizers use a Viterbi algorithm encoder matched
to a reproduction codebook with a trellis structure.
Hierarchical table-lookup vector quantizers provide fixed-rate
vector quantizers with minimal computational complexity.
Many of the early quantization techniques, results, and
applications can be found in original form in Swaszek's 1985
reprint collection on quantization [484] and Abut's 1990
IEEE Reprint Collection on Vector Quantization [2].

We close this section with a brief discussion of two spe- 
cific works which deal with optimizing variable-rate scalar 
quantizers without additional structure, the problem that 
leads to the general formulation of optimal quantization in 
the next section. In 1984 Farvardin and Modestino [155]
extended Berger's [47] necessary conditions for optimality
of an entropy-constrained scalar quantizer to more general
distortion measures and described two design algorithms:
the first is similar to Berger's iterative algorithm,
but the second was a fixed-point algorithm which can be
considered as a natural extension of Lloyd's Method I from
fixed-rate to variable-rate vector quantization. In 1989
Chou et al. [93] developed a generalized Lloyd algorithm
for entropy constrained vector quantization that generalized
Berger's [47], [48] Lagrangian formulation for scalar
quantization and Farvardin and Modestino's fixed-point
design algorithm [155] to vectors. Optimality properties
for minimizing a Lagrangian distortion $D(q) + \lambda R(q)$ were
derived, where rate could be either average length or
entropy. Lloyd's optimal decoder remained unchanged and
the lossless code is easily seen to be an optimal lossless code 
for the encoded vectors, but this formulation shows that 
the optimal encoder must simultaneously consider both the 
distortion and rate resulting from the encoder. In other 
words, quantizers with variable-rate should use an encoder 
that minimizes a sum of squared error and weighted bit 
rate, and not only the squared error. Another approach 
to entropy-constrained scalar quantization is described in 
[285]. 

This is a good place to again mention Gish and Pierce's 
result that if the rate is high, optimal entropy-constrained 
scalar or vector quantization can provide no more than 
roughly 1/4 bit improvement over uniform scalar
quantization with block entropy coding. Berger [47] showed that
permutation codes achieved roughly the same performance
with a fixed-rate vector quantizer. Ziv [578] showed in 1985
that if subtractive dithering is allowed, dithered uniform
quantization followed by block lossless encoding will be at
most 0.754 bits worse than the optimal entropy constrained
vector quantizer with the same block size, even if the rate is
not high. (Subtractive dithering, as will be discussed later,
adds a random dither signal to the input and removes it 
from the decompressed output.) As previously discussed, 
these results do not eliminate the usefulness of fixed-rate 
quantizers, because they may be simpler and avoid the dif- 
ficulties associated with variable-rate codes. These results 
do suggest, however, that uniform quantization and loss- 
less coding is always a candidate and a benchmark for per- 
formance comparison. It is not known if the operational 
distortion-rate function of variable-rate quantization with 
dithering is better than that without dithering. 

The present decade has seen continuing activity in de- 
veloping high resolution theory and design algorithms for 
a variety of quantization structures, and in applying many
of the principles of the theory to optimizing signal process- 
ing and communication systems incorporating quantizers. 
As the arrival of the present is a good place to close our 
historical tour, many results of the current decade will be 
sketched through the remaining sections. It is difficult to 
resist pointing out, however, that in 1990 Lloyd's algorithm 
was rediscovered in the statistical literature under the name
of "principal points," which are distinguished from
traditional $k$-means by the assumption of an absolutely
continuous distribution instead of an empirical distribution [171],
[496], a formulation included in the VQ formulation for a 
general distribution. Unfortunately, these works reflect no 
awareness of the rich quantization literature. 

Most quantizers today are indeed uniform and scalar,
but are combined with prediction or transforms. In many
niche applications, however, the true vector quantizers,
including lattices and other constrained code structures,
exhibit advantages, including the coding of speech
residuals in code excited linear predictive (CELP) speech
coding systems and VXtreme/Microsoft streaming video in
WebTheater. Vector quantization, unlike scalar
quantization, is usually applied to digital signals, e.g., signals that
have already been "finely" quantized by an A/D converter. 
In this case quantization (vector or scalar) truly represents 
compression since it reduces the number of bits required to 
describe a signal and it reduces the bandwidth required to 
transmit the signal description if an analog link is used. 

Modern video coding schemes often incorporate the La- 
grangian distortion viewpoint for accomplishing rate con- 
trol, while using predictive quantization in a general sense 
through motion compensation and uniform quantizers with 
optimized lossless coding of transform coefficients for the 
intraframe coding (cf. [201], [202]).

III. Quantization Basics: Encoding, Rate, 
Distortion, and Optimality 

This section presents, in a self-contained manner, the
basics of memoryless quantization, that is, vector quantizers
which operate independently on successive vectors. For
brevity, we omit the "memoryless" qualifier for most of
the rest of this section. A key characteristic of any
quantizer is its dimension $k$, a positive integer. Its input is a
$k$-dimensional vector $x = (x_1, \ldots, x_k)$ from some alphabet
$A \subset \Re^k$. (Abstract alphabets are also of interest in rate
distortion theory, but virtually all alphabets encountered
in quantization are real-valued vector spaces, in which case
the alphabet is often called the support of the source
distribution.) If $k = 1$ the quantizer is scalar, otherwise it is
vector. In any case, the quantizer consists of three
components: a lossy encoder $\alpha : A \to \mathcal{I}$, where the index set $\mathcal{I}$
is an arbitrary countable set, usually taken as a collection
of consecutive integers, a reproduction decoder $\beta : \mathcal{I} \to \hat{A}$,
where $\hat{A} \subset \Re^k$ is the reproduction alphabet, and a lossless
encoder $\gamma : \mathcal{I} \to \mathcal{J}$, an invertible mapping (at least with
probability 1) into a collection $\mathcal{J}$ of variable-length binary
vectors that satisfies the prefix condition. Alternatively, a
lossy encoder is specified by a partition $\mathcal{S} = \{S_i; i \in \mathcal{I}\}$
of $A$, where $S_i = \{x : \alpha(x) = i\}$; a reproduction decoder
is specified by a (reproduction) codebook $\mathcal{C} = \{\beta(i); i \in \mathcal{I}\}$
of points, codevectors or reproduction codewords; and the
lossless encoder $\gamma$ can be described by its binary codebook
$\mathcal{J} = \{\gamma(i); i \in \mathcal{I}\}$ containing binary or channel codewords.
The quantization rule is the function $q(x) = \beta(\alpha(x))$ or,
equivalently, $q(x) = \beta(i)$ whenever $x \in S_i$.
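
As a concrete illustration of the three components, the
following minimal Python sketch builds a scalar ($k = 1$)
quantizer. The thresholds, reproduction values, and binary
codewords are hypothetical values chosen only for
illustration; they are not taken from the text and are not
claimed to be optimal for any source.

```python
from bisect import bisect_right

# Hypothetical 4-cell scalar quantizer (illustrative values only).
THRESHOLDS = [-1.0, 0.0, 1.0]            # cell boundaries defining the partition S
CODEBOOK = [-1.5, -0.5, 0.5, 1.5]        # reproduction values beta(i)
PREFIX_CODE = ["110", "0", "10", "111"]  # channel codewords gamma(i), prefix-free


def alpha(x):
    """Lossy encoder: index of the cell S_i that contains x."""
    return bisect_right(THRESHOLDS, x)


def beta(i):
    """Reproduction decoder: index -> reproduction value."""
    return CODEBOOK[i]


def gamma(i):
    """Lossless encoder: index -> variable-length binary codeword."""
    return PREFIX_CODE[i]


def q(x):
    """Quantization rule q(x) = beta(alpha(x))."""
    return beta(alpha(x))


if __name__ == "__main__":
    for x in (-2.0, -0.2, 0.3, 4.0):
        print(x, alpha(x), q(x), gamma(alpha(x)))
```

The cells here are the intervals $(-\infty,-1)$, $[-1,0)$,
$[0,1)$ and $[1,\infty)$; shorter codewords are assigned to the
two inner cells on the assumption that they are the more
probable ones.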

A $k$-dimensional quantizer is used by applying its lossy
and lossless encoders, followed by the corresponding
decoders, to a sequence of $k$-dimensional input vectors
$\{\mathbf{x}_n; n = 1, 2, \ldots\}$ extracted from the data being encoded.
There is not a unique way to do such vector extraction; and
the design and performance of the quantizer usually depend
significantly on the specific method that is used. For data
that naturally forms a sequence $x_1, x_2, \ldots$ of scalar-valued
samples, e.g. speech, vector extraction is almost always
done by parsing the data into successive $k$-tuples of
adjacent samples, i.e., $\mathbf{x}_n = (x_{(n-1)k+1}, \ldots, x_{nk})$. As an
example of other possibilities, one could also extract the first
$k$ even samples, followed by the first $k$ odd samples, the
next $k$ even samples, and so on. This subsampling could
be useful for a multiresolution reconstruction, as in
interpolative vector quantization [234], [194]. For other types
of data there may be no canonical extraction method. For
example, in stereo speech the $k$-dimensional vectors might
consist just of left samples, or just of right samples, or half
from each, or $k$ from the left followed by $k$ from the right,
etc. Another example is grayscale imagery where the
$k$-dimensional vectors might come from parsing the image
into rectangular $m$-by-$n$ blocks of pixels, where $mn = k$,
or into other tiling polytopes, such as hexagons and other
shapes aimed at taking advantage of the eye's insensitivity
to noise along diagonals in comparison with along
horizontal and vertical lines [226]. Or the vectors might come from
some less regular parsing. If the image has color, with each
pixel value represented by some three-dimensional vector,
then $k$-dimensional vectors can be extracted in even more
ways. And if the data is a sequence of color images,
e.g. digital video, the extraction possibilities increase
immensely.$^3$
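
Two of the extraction methods just described can be
sketched in a few lines of Python. The function names are
hypothetical, and the handling of leftover samples (they
are simply dropped) is an assumption made here for brevity
rather than something the text prescribes.

```python
def extract_blocks(samples, k):
    """Parse a scalar sequence into successive k-tuples of adjacent samples.

    Trailing samples that do not fill a complete block are dropped.
    """
    n_blocks = len(samples) // k
    return [tuple(samples[n * k:(n + 1) * k]) for n in range(n_blocks)]


def extract_image_blocks(image, m, n):
    """Parse a 2-D list of pixels into m-by-n blocks.

    Each block is flattened row by row into a vector of dimension k = m * n.
    Rows and columns that do not fill a complete block are dropped.
    """
    rows, cols = len(image), len(image[0])
    blocks = []
    for r in range(0, rows - m + 1, m):
        for c in range(0, cols - n + 1, n):
            blocks.append(tuple(image[r + i][c + j]
                                for i in range(m) for j in range(n)))
    return blocks


if __name__ == "__main__":
    print(extract_blocks([1, 2, 3, 4, 5, 6, 7], 3))
    print(extract_image_blocks([[1, 2, 3, 4], [5, 6, 7, 8]], 2, 2))
```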

There are two generic domains in which (memoryless)
quantization theory, both analysis and design, can proceed.
In the first, which we call the random vector domain, the
input data, i.e. source, to be quantized is described by a
fixed value of $k$, an alphabet $A \subset \Re^k$ and a probability
distribution on $A$; and the quantizer must be $k$-dimensional.
This is the case when specific vector dimension and
contents are not allowed to vary, e.g., when 10-dimensional
speech parameter vectors of line spectral pairs or reflection
coefficients are coded together. In the second, which we
call the random process domain, the input data is
characterized as a discrete parameter random process, i.e. a
countable collection (usually infinite) of random variables;
and different ways of extracting vectors from its
component variables may be considered and compared, including
different choices of the dimension $k$. As indicated above,
there are in general many ways to do this. However, for
concreteness and because it provides the opportunity to
make some key points, whenever the random process
domain is of interest in this and the next section, we focus
exclusively on the canonical case where the data naturally
forms a one-dimensional, scalar-valued sequence, and
successive $k$-tuples of adjacent samples are extracted for
quantization. We will also assume that the random process is
stationary, unless a specific exception is made. Stationary
models can easily be defined to include processes that
exhibit distinct local and global stationarity properties (such
as speech and images) by the use of models such as
composite, hidden Markov, and mixture sources. In the random
vector domain, there is no first order stationarity
assumption; e.g., the individual components within each vector
need not be identically distributed. In either domain we
presume that the quantizer operates on a $k$-dimensional
random vector $X = (X_1, \ldots, X_k)$, usually assumed to be
absolutely continuous so that it is described by a
probability density function (pdf) $f(x)$. Densities are usually
assumed to have finite variance in order to avoid technical
difficulties.

3 For example, the video community has had a longstanding debate
between progressive vs. interlaced scanning, two different
extraction methods.

Memoryless quantizers, as described here, are also
referred to as "vanilla" vector quantizers or block source
codes. The alternative is a quantizer with memory. Mem- 
ory can be incorporated in a variety of ways; it can be 
used separately for the lossy encoder (for example different 
mappings can be used, conditional on the past) or for the 
lossless encoder (the index produced by a quantizer can be 
coded conditionally based on previous indices). We shall 
return to vector quantizers with memory in Section V, but 
our primary emphasis will remain on memoryless
quantizers. We will occasionally use the term code as a generic
substitute for quantizer. 

The instantaneous rate of the quantizer applied to a
particular input is the normalized length $r(x) = \frac{1}{k} l(\gamma(\alpha(x)))$ of
the channel codeword, the number of bits per source
symbol that must be sent to describe the reproduction. An
important special case is when all binary codewords have 
the same length r, in which case the quantizer is referred 
to as fixed-length or fixed-rate. 

To measure the quality of the reproduction, we assume 
the existence of a nonnegative distortion measure $d(x,\hat{x})$
which assigns a distortion or cost to the reproduction of
input $x$ by $\hat{x}$. Ideally one would like a distortion measure
that is easy to compute, useful in analysis, and perceptually 
meaningful in the sense that small (large) distortion means 
good (poor) perceived quality. No single distortion measure 
accomplishes all three goals, but the common squared error 
distortion 

$$d(x,\hat{x}) = \|x - \hat{x}\|^2 = (x - \hat{x})^t (x - \hat{x}) = \sum_{i=1}^{k} |x_i - \hat{x}_i|^2$$

satisfies the first two. Although much maligned for lack 
of perceptual meaningfulness, it often is a useful indicator
of perceptual quality and, perhaps more importantly, it 
can be generalized to a class of distortion measures that 
have proved useful in perceptual coding, the input weighted 
quadratic distortion measures of the form 



$$d(x,\hat{x}) = (x - \hat{x})^t W_x (x - \hat{x}), \qquad (21)$$



where $W_x$ is a positive definite matrix that depends on
the input, cf. [258], [259], [257], [224], [387], [386], [150],
[186], [316], [323], [325]. Most of the theory and design
techniques considered here extend to such measures, as will
be discussed later. We also assume that $d(x,\hat{x}) = 0$ if and
only if $x = \hat{x}$, an assumption that involves no genuine loss
of generality and allows us to consider a lossless code as a
code for which $d(x, \beta(\alpha(x))) = 0$ for all inputs $x$.

There exists a considerable literature for various other 
distortion measures, including $l_p$ and other norms of
differences and convex or nondecreasing functions of norms
of differences. These have rarely found application in real
systems, however, so our emphasis will be on the MSE with
comments on generalizations to input-weighted quadratic
distortion measures. 

The overall performance of a quantizer applied to a 
source is characterized by the normalized rate 

$$R(\alpha,\gamma) = E[r(X)] = \frac{1}{k} E[l(\gamma(\alpha(X)))] = \frac{1}{k} \sum_i l(\gamma(i)) \int_{S_i} f(x)\,dx$$



and the normalized average distortion 

$$D(\alpha,\beta) = \frac{1}{k} E[d(X, \beta(\alpha(X)))] = \frac{1}{k} \sum_i \int_{S_i} d(x, y_i) f(x)\,dx.$$

Every quantizer $(\alpha, \gamma, \beta)$ is thus described by a
rate-distortion pair $(R(\alpha,\gamma), D(\alpha,\beta))$. The goal of compression
system design is to optimize the rate-distortion tradeoff. 
Fixed-rate quantizers constrain this optimization by not al- 
lowing a code to assign fewer bits to inputs that might ben- 
efit from such, but they provide simpler codes that avoid 
the necessity of buffering in order to match variable-rate 
codewords to a possibly fixed-rate digital channel. 
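
When the source is only available through samples, the
expectations defining the normalized rate and distortion can
be replaced by sample averages. The sketch below does this
for squared error. The function name `rate_and_distortion`,
the sign-based two-dimensional quantizer, and the
reproduction level 0.8 are all assumptions introduced for
illustration; the Gaussian source is likewise a hypothetical
choice.

```python
import random


def rate_and_distortion(vectors, alpha, beta, gamma):
    """Sample-average estimates of R(alpha, gamma) and D(alpha, beta).

    Both are normalized by the dimension k, i.e. bits per source symbol
    and squared error per source symbol.
    """
    k = len(vectors[0])
    total_bits = 0
    total_dist = 0.0
    for x in vectors:
        i = alpha(x)
        y = beta(i)
        total_bits += len(gamma(i))                        # l(gamma(alpha(x)))
        total_dist += sum((a - b) ** 2 for a, b in zip(x, y))
    n = len(vectors)
    return total_bits / (k * n), total_dist / (k * n)


# Hypothetical fixed-rate, 2-dimensional quantizer: one sign bit per coordinate.
LEVEL = 0.8


def alpha(x):
    return (x[0] >= 0) * 2 + (x[1] >= 0)


def beta(i):
    return (LEVEL if i & 2 else -LEVEL, LEVEL if i & 1 else -LEVEL)


def gamma(i):
    return format(i, "02b")   # fixed-length 2-bit codeword


if __name__ == "__main__":
    rng = random.Random(0)
    source = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]
    R, D = rate_and_distortion(source, alpha, beta, gamma)
    print(f"rate = {R:.3f} bits/sample, distortion = {D:.4f}")
```

Because every codeword has length 2 and $k = 2$, the rate is
exactly 1 bit per sample regardless of the data; only the
distortion estimate depends on the samples drawn.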

The optimal rate-distortion tradeoff for a fixed
dimension $k$ can be formalized in several ways: by optimizing
distortion for a constrained rate, by optimizing rate for a
constrained distortion, or by an unconstrained
optimization using a Lagrange approach. These approaches lead
respectively to the operational distortion-rate function

$$\delta(R) = \inf_{(\alpha,\gamma,\beta):\, R(\alpha,\gamma) \le R} D(\alpha,\beta),$$

the operational rate-distortion function

$$r(D) = \inf_{(\alpha,\gamma,\beta):\, D(\alpha,\beta) \le D} R(\alpha,\gamma),$$

and the operational Lagrangian or weighted distortion-rate
function

$$L(\lambda) = \inf_{(\alpha,\gamma,\beta)} \left[ D(\alpha,\beta) + \lambda R(\alpha,\gamma) \right],$$

where $\lambda$ is a nonnegative number. A small value of $\lambda$ leads
to a low distortion, high rate solution and a large value
leads to a low rate, high distortion solution. Note that

$$D(\alpha,\beta) + \lambda R(\alpha,\gamma) = \frac{1}{k} E\left[ d(X, \beta(\alpha(X))) + \lambda\, l(\gamma(\alpha(X))) \right],$$

so that the bracketed term can be considered to be a
modified or Lagrangian distortion, and that $L(\lambda)$ is the smallest
average Lagrangian distortion. All of these formalizations
of optimal performance have their uses, and all are
essentially equivalent: the distortion-rate and rate-distortion
functions are duals and every distortion-rate pair on the
convex hull of these curves corresponds to the Lagrangian
for some value of $\lambda$. Note that if one constrains the problem
to fixed-rate codes, then the Lagrangian approach reduces
to the distortion-rate approach since $R(\alpha,\gamma)$ no longer
depends on the code and $\gamma$ can be considered as just a binary
indexing of $\mathcal{I}$.

Formal definitions of quantizer optimality easily yield 
optimality conditions as direct vector extensions and vari- 
ations on Lloyd's conditions. The conditions all have a 
common flavor: if two components of the code $(\alpha, \gamma, \beta)$ are
fixed, then the third component must have a specific form
for the code to be optimal. The resulting optimality
properties are summarized below. The proofs are simple and
require no calculus of variations or differentiation. Proofs 
may be found, e.g., in [94], [196]. 

• For a fixed lossy encoder $\alpha$, regardless of the lossless
encoder $\gamma$, the optimal reproduction decoder $\beta$ is given by

$$\beta(i) = \arg\min_y E[d(X, y) \mid \alpha(X) = i],$$

the output minimizing the conditional expectation of the
distortion between the output and the input given that the
encoder produced index $i$. These vectors are called the
Lloyd centroids. Note that the optimal decoder output for
a given encoder output $i$ is simply the optimal estimate of
the input vector $X$ given $\alpha(X) = i$ in the sense of
minimizing the conditional average distortion. If the distortion
is squared error, the reproduction decoder is simply the
conditional expectation of $X$ given it was encoded into $i$:

$$\mathrm{centroid}(S_i) = E[X \mid X \in S_i].$$

If the distortion measure is the input-weighted squared
error of (21), then [318], [224]

$$\mathrm{centroid}(S_i) = E[W_X \mid X \in S_i]^{-1} E[W_X X \mid X \in S_i].$$

• For a fixed lossy encoder $\alpha$, regardless of the reproduction
decoder $\beta$, the optimal lossless encoder $\gamma$ is the optimal
lossless code for the discrete source $\alpha(X)$, e.g., a Huffman
code for the lossy encoded source.

• For a fixed reproduction decoder $\beta$, lossless code $\gamma$ and
Lagrangian parameter $\lambda$, the optimal lossy encoder is a
minimum distortion (nearest neighbor) encoder for the
modified Lagrangian distortion measure:

$$\alpha(x) = \arg\min_i \left( d(x, \beta(i)) + \lambda\, l(\gamma(i)) \right).$$
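
The first and third conditions can be sketched directly in
Python for squared error. The helper names and the toy
codebook, codeword lengths, and value of $\lambda$ below are
hypothetical; they are chosen so that the Lagrangian encoder
and the plain nearest neighbor encoder disagree on the same
input.

```python
def sq_err(x, y):
    """Squared error distortion between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def centroid(cell):
    """Squared-error Lloyd centroid: the sample mean of the vectors in a cell."""
    k = len(cell[0])
    return tuple(sum(v[j] for v in cell) / len(cell) for j in range(k))


def nearest_neighbor_encode(x, codebook):
    """Minimum distortion encoder (the fixed-rate case)."""
    return min(range(len(codebook)), key=lambda i: sq_err(x, codebook[i]))


def lagrangian_encode(x, codebook, lengths, lam):
    """Encoder minimizing d(x, beta(i)) + lam * l(gamma(i))."""
    return min(range(len(codebook)),
               key=lambda i: sq_err(x, codebook[i]) + lam * lengths[i])


if __name__ == "__main__":
    codebook = [(0.0,), (1.0,)]   # hypothetical reproduction points
    lengths = [1, 3]              # hypothetical codeword lengths in bits
    x = (0.6,)
    print(nearest_neighbor_encode(x, codebook))          # closer point wins
    print(lagrangian_encode(x, codebook, lengths, 0.2))  # cheaper codeword wins
```

With $\lambda = 0.2$ the modified distortions are $0.36 + 0.2 = 0.56$
for the first codeword and $0.16 + 0.6 = 0.76$ for the second,
so the Lagrangian encoder selects the farther but cheaper
reproduction point.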

If the code is constrained to be fixed-rate, then the sec- 
ond property is irrelevant and the third property reduces 
to the familiar minimum distortion encoding with respect 
to $d$, as in the original formulation of Lloyd (and implicit
in Shannon). (The resulting partition is often called a 
Voronoi partition.) In the general variable-rate case, the 
minimum distance (with respect to the distortion measure 
d) encoder is suboptimal; the optimal rule takes into ac- 
count both distortion and codeword length. Thus simply 
cascading a minimum MSE vector quantizer with a lossless 
code is suboptimal. Instead, in the general case instanta- 
neous rate should be considered in an optimal encoding, as 
the goal is to trade off distortion and rate in an optimal 



fashion. In all of these cases the encoder can be viewed as 
a mechanism for controlling the output of the decoder so 
as to minimize the total Lagrangian distortion. 

The optimality conditions imply a descent algorithm for 
code design: Given some $\lambda$, begin with an initial code
$(\alpha, \beta, \gamma)$. Optimize the encoder $\alpha$ for the other two
components, then optimize the reproduction decoder $\beta$ for the
remaining components, then optimize the lossless coder $\gamma$ for
the remaining components. Let $T$ denote the overall
transformation resulting from these three operations. One such
iteration of $T$ must decrease or leave unchanged the
average Lagrangian distortion. Iterate until convergence or the
improvement falls beneath some threshold. This algorithm 
is an extension and variation on the algorithm for opti- 
mal scalar quantizer design introduced for fixed-rate scalar 
quantization by Lloyd [330]. The algorithm is a fixed-point 
algorithm since if it converges to a code, the code must be 
a fixed point with respect to T. This generalized Lloyd al- 
gorithm applies to any distribution, including parametric 
models and empirical distributions formed from training 
sets of real data. There is no obvious means of choosing
the "best" $\lambda$, so the design algorithm might sweep through
several values to provide a choice of rate-distortion pairs. 
We also mention that Lloyd style iterative algorithms have 
been used to design many structured forms of quantiza- 
tion. For example, when the codes are constrained to have 
fixed rate, the algorithm becomes $k$-means clustering,
finding a fixed number of representative points that yield the
minimum average distortion when a minimum distortion 
mapping is assumed. 
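
The descent iteration described above can be sketched for a
scalar quantizer designed from a training set. This is a
minimal illustration under stated assumptions, not the
algorithm of any particular reference: rate is measured by
the idealized codeword length $-\log_2 p_i$ (the entropy
option) rather than by an actual prefix code, cells that
become empty are simply dropped, and the function name
`design_ecsq` is hypothetical.

```python
import math
import random


def design_ecsq(train, codebook, lam, iters=20):
    """Lloyd-style descent for an entropy-constrained scalar quantizer.

    Returns (codebook, lengths, history), where history[t] is the average
    Lagrangian distortion on the training set after iteration t.
    """
    lengths = [math.log2(len(codebook))] * len(codebook)  # start fixed-rate
    history = []
    for _ in range(iters):
        # 1) Optimize the lossy encoder for the current decoder and lengths.
        cells = [[] for _ in codebook]
        for x in train:
            i = min(range(len(codebook)),
                    key=lambda j: (x - codebook[j]) ** 2 + lam * lengths[j])
            cells[i].append(x)
        cells = [c for c in cells if c]          # drop unused indices
        # 2) Optimize the reproduction decoder: squared-error centroids.
        codebook = [sum(c) / len(c) for c in cells]
        # 3) Optimize the lossless code: idealized lengths -log2(p_i).
        lengths = [-math.log2(len(c) / len(train)) for c in cells]
        cost = sum((x - y) ** 2 + lam * l
                   for c, y, l in zip(cells, codebook, lengths)
                   for x in c) / len(train)
        history.append(cost)
    return codebook, lengths, history


if __name__ == "__main__":
    rng = random.Random(1)
    train = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    init = [-2.1 + 0.6 * i for i in range(8)]
    for lam in (0.0, 0.05, 0.2):                 # sweep a few values of lambda
        cb, ln, hist = design_ecsq(train, init, lam)
        print(lam, len(cb), round(hist[-1], 4))
```

Setting `lam = 0.0` reduces the encoder step to a nearest
neighbor rule, so the iteration becomes ordinary $k$-means
on the training set.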

As mentioned in Section I, a variety of other clustering 
algorithms exist that can be used to design vector quantiz- 
ers (or solve any other clustering problems). Although each 
has found its adherents, none has convincingly yielded sig- 
nificant benefits over the Lloyd algorithm and its variations 
in terms of trading off rate and distortion, although some 
have proved much faster (and others much slower). Some 
algorithms such as simulated and deterministic annealing 
have been found experimentally to do a better job of avoid- 
ing local optima and finding globally optimal distortion- 
rate pairs than has the basic Lloyd algorithm, but repeated 
applications of the Lloyd algorithm with different initial 
conditions has also proved effective in avoiding local op- 
tima. We focus on the Lloyd algorithm because of its sim- 
plicity, its proven merit at designing codes, and because 
of the wealth of results regarding its convergence
properties [451], [418], [108], [91], [101], [321], [335], [131], [36].

The centroid property of optimal reproduction decoders
has interesting implications in the special case of a
squared-error distortion measure, where it follows easily [137], [60],
[193], [184], [196] that

• $E[q(X)] = E[X]$, so that the quantizer output can be
considered as an unbiased estimator of the input.

• $E[q_i(X)(q_j(X) - X_j)] = 0$, for all $i, j$, so that each
component of the quantizer output is orthogonal to each
component of the quantizer error. This is an example of the well
known fact that the minimum mean squared error estimate
of an unknown, $X$, given an observation, $\alpha(X)$, causes the
estimate to be orthogonal to the error. In view of the
previous property, this implies that the quantizer error is
uncorrelated with the quantizer output rather than, as is
often assumed, with the quantizer input.

• $E[\|q(X) - X\|^2] = E[\|X\|^2] - E[\|q(X)\|^2]$, which implies
that the energy (or variance) of the quantized signal must
be less than that in the original signal.

• $E[X^t (q(X) - X)] = -E[\|q(X) - X\|^2]$, which shows that
the quantizer error is not uncorrelated with the input. In
fact the correlation is minus the mean squared error.
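
These four identities hold exactly for the empirical
distribution of a data set whenever the reproduction levels
are the sample centroids of their cells, which makes them
easy to check numerically. The sketch below does so for a
scalar quantizer; the thresholds and the Gaussian test data
are hypothetical choices, and the function names are
introduced only for this illustration.

```python
import random
from bisect import bisect_right


def centroid_quantize(data, thresholds):
    """Quantize scalar data with cells set by thresholds and centroid levels."""
    cells = [[] for _ in range(len(thresholds) + 1)]
    for x in data:
        cells[bisect_right(thresholds, x)].append(x)
    # Empty cells are never selected, so their level value is irrelevant.
    levels = [sum(c) / len(c) if c else 0.0 for c in cells]
    return [levels[bisect_right(thresholds, x)] for x in data]


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    q = [v for v in centroid_quantize(x, [-1.0, 0.0, 1.0])]
    mse = mean([(b - a) ** 2 for a, b in zip(x, q)])
    print("E[q] - E[X]            :", mean(q) - mean(x))
    print("E[q (q - X)]           :", mean([b * (b - a) for a, b in zip(x, q)]))
    print("E[X^2] - E[q^2] - MSE  :",
          mean([a * a for a in x]) - mean([b * b for b in q]) - mse)
    print("E[X (q - X)] + MSE     :", mean([a * (b - a) for a, b in zip(x, q)]) + mse)
```

All four printed quantities should be zero up to
floating-point rounding.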

It is instructive to consider the extreme points of the
rate-distortion tradeoff, when the distortion is zero (or $\lambda = 0$)
and the rate is 0 (when $\lambda = \infty$). First suppose that
$\lambda = 0$. In this case the rate does not affect the Lagrangian
distortion at all, but MSE counts. If the source is discrete,
then one can optimize this case by forcing zero distortion,
that is, using a lossless code. In this case Shannon's lossless
coding theorem implies that for rate measured by average
instantaneous codelength,

$$H(X) \le r(0) < H(X) + 1,$$

or, if rate is measured by entropy, then simply $r(0) = H(X)$,
the entropy of the vector. In terms of the
Lagrangian formulation, $L(0) = 0$. Conversely, suppose that
$\lambda \to \infty$. In this case distortion costs a negligible amount
and rate costs an enormous amount, so here the optimal
is attained by using zero rate and simply tolerating
whatever distortion one must suffer. The distortion for a zero
rate code is minimized by the centroid of the unconditional
distribution,

$$D(0) = \min_y E[d(X, y)],$$

which is simply the mean $E[X]$ in the MSE case. Here the
Lagrangian formulation becomes $L(\infty) = \min_y E[d(X, y)]$.
Both of these extreme points are global optima, albeit the
second is useless in practice.

So far we have focused on the random vector domain 
and considered optimality for quantizers of a fixed dimen- 
sion. In practice, however, and in source coding theory, the 
dimension k may be a parameter of choice, and it is of inter- 
est to consider how the optima depend on it. Accordingly, 
we now focus on the random process domain, assuming
that the source is a one-dimensional, scalar-valued,
stationary random process. In this situation, the various
operational optima explicitly note the dimension, e.g., $\delta_k(R)$
denotes the operational distortion-rate function for
dimension $k$ and rate $R$ and similarly $r_k(D)$ and $L_k(\lambda)$ denote the
operational rate-distortion and Lagrange functions.
Moreover, the overall optimal performance for all quantizers of
rate less than or equal to $R$ is defined by

$$\delta(R) = \inf_k \delta_k(R). \qquad (22)$$

Similar definitions hold for the rate vs. distortion and the 
Lagrangian viewpoints. 

Using stationarity, it can be shown (cf. [562], [577], [221],
Lemma 11.2.3 of [217]) that the operational distortion-rate
function is subadditive in the sense that for any positive
integers $k$ and $l$

$$\delta_{k+l}(R) \le \frac{k}{k+l}\,\delta_k(R) + \frac{l}{k+l}\,\delta_l(R), \qquad (23)$$

which shows the generally decreasing trend of the $\delta_k(R)$'s
as $k$ increases. It is not known whether or not $\delta_{k+1}(R)$
is always less than or equal to $\delta_k(R)$. However, it can be
shown that subadditivity implies (cf. [180], p. 112)

$$\delta(R) = \lim_{k \to \infty} \delta_k(R). \qquad (24)$$

Hence high dimensional quantizers can do as well as any 
quantizer. Note that (23) and (24) both hold for the special 
cases of fixed-rate quantizers as well as for variable-rate 
quantizers. 

It is important to point out that for squared error and 
most other distortion measures, the "inf" in (22) is not a
"min". Specifically, $\delta(R)$ represents performance that
cannot be achieved exactly, except in degenerate situations
such as when $R = 0$ or the source distribution is discrete
rather than continuous. Of course, by the infimum
definition of $\delta(R)$, there are always quantizers with
performance arbitrarily close to it. We conclude that no
quantizers are truly optimal. Thus, it is essential to understand
that whenever the word "optimal" is used in the random
process domain, it is always in the context of some
specific constraint or class of quantizers, such as 8-dimensional
fixed-rate VQ or entropy-constrained uniform scalar quan- 
tization or pyramid coding with dimension 20, to name 
a few at random. Indeed, though desirable, "optimality" 
loses a bit of its lustre when one considers the fact that 
an optimal code in one class might not work as well as 
a suboptimal code in another. It should now be evident 
that the importance of the Lloyd-style optimality principles 
lies ultimately in their ability to guide the optimization of 
quantizers within specific constraints or classes. 

IV. High Resolution Quantization Theory 

This section presents an overview of high resolution
theory and compares its results to those of Shannon rate
distortion theory. For simplicity, we will adopt squared error
as the distortion measure until late in the section where
extensions to other distortion measures are discussed. There
have been two styles of high resolution theory
developments: informal, where simple approximations are made,
and rigorous, where limiting formulas are rigorously
derived. Here, we proceed with the informal style until later
when the results of the rigorous approach are summarized.
We will also presume the "random vector domain" of fixed
dimension, as described in the previous section, until stated 
otherwise. 

A. Asymptotic Distortion 

As mentioned earlier, the first and most elementary
result in high resolution theory is the $\Delta^2/12$ approximation
to the mean squared error of a uniform scalar quantizer
with step size $\Delta$ [468], [394], [43], which we now derive.
Consider an $N$-level uniform quantizer $q$ whose levels are
$y_1, \ldots, y_N$ with $y_i = y_{i-1} + \Delta$. When this quantizer is
applied to a continuous random variable $X$ with
probability density $f(x)$, when $\Delta$ is small, and when overload
distortion can be ignored, the mean squared error (MSE)
distortion may be approximated as follows:

$$\begin{aligned}
D(q) &= E[(X - q(X))^2] \\
&\approx \sum_{i=1}^{N} \int_{y_i - \Delta/2}^{y_i + \Delta/2} (x - y_i)^2 f(x)\,dx \\
&\approx \sum_{i=1}^{N} f(y_i) \int_{y_i - \Delta/2}^{y_i + \Delta/2} (x - y_i)^2\,dx \\
&= \frac{\Delta^2}{12} \sum_{i=1}^{N} f(y_i)\,\Delta \\
&\approx \frac{\Delta^2}{12} \int_{y_1 - \Delta/2}^{y_N + \Delta/2} f(x)\,dx \\
&\approx \frac{\Delta^2}{12}.
\end{aligned}$$

The first approximation in the above derives from ignoring 
overload distortion. If the source density is entirely con- 
tained in the granular region of the quantizer, then this 
approximation is not needed. The second approximation 
derives from observing that the density may be approxi- 
mated as a constant on a small interval. Usually, as in the 
mean value theorem of integration, one assumes the density 
is continuous, but as any measurable function is
approximately continuous, when $\Delta$ is sufficiently small this
approximation is valid even for discontinuous densities. The
third approximation derives from recognizing that by the
definition of a Riemann integral, $\sum_{i=1}^{N} f(y_i)\Delta$ is
approximately equal to the integral of $f$. Finally, the last
approximation derives from again ignoring the overload region.
As mentioned in earlier sections, there are situations, such
as variable-rate quantization, where an infinite number of
levels are permitted. In such cases, if the support of the
uniform scalar quantizer contains that of the source
density, then there will be no overload distortion to ignore, and
again we have $D \approx \Delta^2/12$.
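
The approximation is easy to check by simulation in the
no-overload setting just described. The sketch below
quantizes samples of a unit-variance Gaussian source
(a hypothetical choice) with an unbounded uniform quantizer
whose levels are the integer multiples of the step size,
and reports the ratio of the measured MSE to $\Delta^2/12$.
The function names, sample count, and seed are assumptions
made for illustration.

```python
import random


def uniform_quantize(x, step):
    """Unbounded uniform scalar quantizer with levels at multiples of step."""
    return step * round(x / step)


def mse_ratio(step, n=50000, seed=0):
    """Ratio of the sample MSE to step**2 / 12 for a unit-variance Gaussian."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        total += (x - uniform_quantize(x, step)) ** 2
    return (total / n) / (step ** 2 / 12)


if __name__ == "__main__":
    for step in (1.0, 0.5, 0.25):
        print(f"step = {step}: MSE / (step^2 / 12) = {mse_ratio(step):.3f}")
```

The ratios should be close to 1, with the remaining
deviation dominated by sampling error rather than by the
approximation itself.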

It is important to mention the sense in which $D$ is
approximated by $\Delta^2/12$. After all, when $\Delta$ is small, both $D$
and $\Delta^2/12$ will be small, so it is not saying much to assert
that their difference is small. Rather, as discussed later in
the context of the rigorous framework for high resolution
theory, it can be shown that under ordinary conditions, the
ratio of $D$ and $\Delta^2/12$ tends to 1 as $\Delta$ decreases. Though
we will not generally mention it, all future high resolution
approximations discussed in this paper will also hold in this
ratio-tending-to-one sense.

Each of the assumptions and simple approximations 
made in deriving $\Delta^2/12$ reoccurs in some guise in the
derivation of all subsequent high-resolution formulas, such
as for nonuniform, vector and variable-rate quantizers.
Thus, they might be said to be principal suppositions. In- 
deed the small cell type of supposition is what gives the 
theory its "high resolution" name. 



In uniform quantization, all cells have the same size and 
shape and the levels are in the center of each cell (except
for the outermost cells which are ignored). Thus, the cell
size $\Delta$ is the key performance determining gross
characteristic. In more advanced, e.g. vector, quantization, cells
may differ in size and shape, and the codevectors need not
be in the centers of the cells. Consequently, other gross
characterizations are needed. These are the point density
and the inertial profile.

The point density of a vector quantizer is the direct
extension of the point density introduced in Section II. That
is, it is a nonnegative, usually smooth function $\lambda(x)$ that,
when integrated over a region, determines the approximate
fraction of codevectors contained in that region. In
fixed-rate coding, the point density is usually normalized by the
number of codevectors so that its total integral is one. In
variable-rate coding, where the number of codevectors is
not a key performance determining parameter and may
even be infinite, the point density is usually left
unnormalized. As we consider fixed-rate coding first, we will
presume $\lambda$ is normalized, until stated otherwise. There
is clearly an inverse relationship between the point
density and the volume of cells, namely, $\lambda(x) \approx (N\,\mathrm{vol}(S_x))^{-1}$,
where as before, $N$ is the number of codevectors or cells
and $S_x$ denotes the cell containing $x$.

As with any density that describes a discrete set of
points, there is not a unique way to define it for a specific
quantizer. Rather, the point density is intended as a high
level, gross characterization, or a model or target to which
a quantizer aspires. It describes the codevectors in much
the way that a probability density describes a set of data
points: it does not say exactly where they are located,
but roughly characterizes their distribution. Quantizers
with different numbers of codevectors can be compared on
the basis of their point density, and there is an ideal point
density to which quantizers aspire; they cannot achieve
it exactly, but may approximate it. Nevertheless, there are
times when a concrete definition of the point density of a
specific quantizer is needed. In such cases, the following
is often used: the specific point density of a quantizer $q$ is
$\lambda_q(x) = (N\,\mathrm{vol}(S_x))^{-1}$. This piecewise constant function
captures all the (fine) detail in the quantizer's partition, in
contrast to the usual notion of a point density as a gross
characterization. As an example of its use, we mention that
for fixed-rate quantization, the ideal point density $\lambda(x)$ is
usually a smooth function, closely related to the source
density, and one may say that a quantizer has point density
approximately $\lambda(x)$ if $\lambda_q(x) \cong \lambda(x)$ for all $x$ in some
set with high probability (relative to the source density).
When a scalar quantizer is implemented as a compander,
$\lambda(x)$ is proportional to the derivative of the compressor
function applied to the input. Though the notion of point
density would no doubt have been recognizable to the earliest
contributors such as Bennett, Panter and Dite, as mentioned
earlier, it was not explicitly introduced until Lloyd's
work [330].

In nonuniform scalar quantization and vector quantization,
there is the additional issue of codevector placement
within cells and, in the latter case, of cell shape. The effect
of point placement and cell shape is exhibited in the following
approximation to the contribution of a small cell $S_i$
with codevector $y_i$ to the MSE of a $k$-dimensional vector
quantizer:

$$D_i(q) = \frac{1}{k} \int_{S_i} \|x - y_i\|^2 f(x)\,dx \tag{25}$$

$$\cong f(y_i)\, M(S_i, y_i)\, \mathrm{vol}(S_i)^{1+2/k}, \tag{26}$$

where $M(S_i, y_i)$ is the normalized moment of inertia of the
cell $S_i$ about the point $y_i$, defined by

$$M(S_i, y_i) = \frac{1}{k}\, \frac{\int_{S_i} \|x - y_i\|^2\,dx}{\mathrm{vol}(S_i)^{1+2/k}}.$$

Normalizing by volume makes $M$ independent of the size
of the cell. Normalizing by dimension yields a kind of invariance
to dimension, namely that $M(S_i \times S_i, (y_i, y_i)) = M(S_i, y_i)$.
We often write $M(S_i)$ when $y_i$ is clear from the
context. The normalized moment of inertia, and the resulting
contribution $D_i(q)$, is smaller for sphere-like cells with
codevectors in the center than for cells that are oblong,
have sharply pointed vertices, or have displaced codevectors.
In the latter cases, there are more points farther from
$y_i$ that contribute substantially to the normalized moment of
inertia, especially when the dimension is large.

In some quantizers, such as uniform scalar and lattice
quantizers, all cells (with the exception of the outermost
cells) have the same shape and the same placement of
codevectors within cells. In other quantizers, however, cell
shape or codevector placement varies with position. In
such cases, it is useful to characterize the variation of cell
normalized moment of inertia by a nonnegative, usually
smooth function $m(x)$, called the inertial profile. That
is, $m(x) \cong M(S_i, y_i)$ when $x \in S_i$. As with point densities,
we do not define $m(x)$ to be equal to $M(S_x, q(x))$
because we want it to be a high level gross characterization
or model to which a quantizer aspires. Instead we let
$m_q(x) = M(S_x, q(x))$ be called the specific inertial profile
of the quantizer $q$. This is a piecewise constant function
that captures the fine details of cell normalized moment of
inertia.

Returning to $D_i(q)$ expressed in (26), the effect of cell
size is obviously in the term $\mathrm{vol}(S_i)$. Using the inverse
relationship between point density and cell volume yields

$$D_i(q) \cong f(y_i)\, M(S_i, y_i)\, \frac{\mathrm{vol}(S_i)}{N^{2/k} \lambda^{2/k}(y_i)},$$

which shows how point density locally influences distortion.
Summing the above over all cells and recognizing the
sum as an approximation to an integral yields the following
approximation to the distortion of a vector quantizer

$$D(q) \cong \frac{1}{N^{2/k}} \int \frac{m(x)}{\lambda^{2/k}(x)}\, f(x)\,dx. \tag{27}$$

For scalar quantizers ($k = 1$) with points in the middle of
the cells, $m(x) = 1/12$ and the above reduces to

$$D \cong \frac{1}{12 N^2} \int \frac{1}{\lambda^2(x)}\, f(x)\,dx, \tag{28}$$

which is what Bennett [43] found for companders, as restated
in terms of point densities by Lloyd [330]. Both (28)
and the more general formula (27) are called Bennett's integral.
The extension of Bennett's integral to vector quantizers
was first made by Gersho (1979) [193] for quantizers
with congruent cells, for which the concept of inertial profile
was not needed, and then to vector quantizers with varying
cell shapes (and codevector placements) by Na and Neuhoff
(1995) [365].

Bennett's integral (27) can be expected to be a good approximation
under the following conditions: (i) Most cells
are small enough that $f(x)$ can be approximated as being
constant over the cell. (There can be some large cells where
$f(x)$ is very small.) Ordinarily, this requires $N$ to be large.
(ii) The specific point density of the quantizer approximately
equals $\lambda(x)$ on a high probability set of $x$'s. (iii)
The specific inertial profile approximately equals $m(x)$ on
a high probability set of $x$'s. (iv) Adjacent cells have similar
volumes. The last condition rules out quantizers such
as a scalar one whose cells have alternating lengths such as
$\Delta, \tfrac{1}{2}\Delta, \tfrac{1}{2}\Delta, \Delta, \tfrac{1}{2}\Delta, \tfrac{1}{2}\Delta, \Delta, \ldots$. The point density of such
a quantizer is $\Lambda(x) = 3/(2\Delta)$ (unnormalized), because there are 3 points in
an interval of width $2\Delta$. Assuming, for simplicity, that the
source density is uniform on $[0,1]$, it is easy to compute
$D = \tfrac{5}{96}\Delta^2$, whereas Bennett's integral equals $\tfrac{1}{27}\Delta^2$. One
may obtain the correct distortion by separately applying
Bennett's integral to the union of intervals of length $\Delta$ and
to the union of intervals of length $\tfrac{1}{2}\Delta$. The problem is that
Bennett's integral is not linear in the point density. So for
it to be accurate, cell size must change slowly or only occasionally.
Since Bennett's integral is linear in the inertial
profile, it is not necessary to assume that adjacent cells
have similar shapes, although one would normally expect
this to be the case in situations where Bennett's integral
is applied. Examples of the use of the vector extension of
Bennett's integral will be given later.
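
The failure of Condition (iv) can be checked with exact rational arithmetic. The following is only an illustrative sketch: the function names and the choice $\Delta = 1/8$ are assumptions made here, and the source is taken to be uniform on $[0,1]$ with levels at cell midpoints, as in the example above.

```python
from fractions import Fraction

def exact_mse(cell_lengths):
    """Exact MSE for a uniform source on an interval tiled by the given
    cells, with each level at the midpoint of its cell."""
    total = sum(cell_lengths)
    # A cell of length a has probability a/total and conditional MSE a^2/12.
    return sum(Fraction(a) ** 3 for a in cell_lengths) / (12 * total)

def bennett_mse(cell_lengths):
    """Bennett's integral (28) with one constant point density, taken as
    the number of cells per unit length (an unnormalized density)."""
    density = Fraction(len(cell_lengths)) / sum(cell_lengths)
    return 1 / (12 * density ** 2)

def bennett_by_cell_type(cell_lengths):
    """Bennett's integral applied separately to the union of cells of each
    length, weighted by the probability of that union."""
    total = sum(cell_lengths)
    return sum(Fraction(a, 1) / total * Fraction(a) ** 2 / 12 for a in cell_lengths)

delta = Fraction(1, 8)                      # hypothetical cell size
period = [delta, delta / 2, delta / 2]      # one period has width 2*delta
cells = period * 4                          # four periods tile [0, 1]

print(exact_mse(cells), bennett_mse(cells), bennett_by_cell_type(cells))
```

With these assumptions the exact distortion comes out as $5\Delta^2/96$ and the single-density form of Bennett's integral as $\Delta^2/27$, while the per-type application recovers the exact value.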

Approximating the source density as a constant over each
quantization cell, which is a key step in the derivations of
(26) and (28), is like assuming that the effect of quantization
is to add noise that is uniformly distributed. However,
the range of noise values must match the size and shape of
the cell. And so when the cells are not all of the same size
and shape, such quantization noise is obviously correlated
with the vector $X$ being quantized. On the other hand, for
uniform scalar and lattice vector quantizers, the error and
$X$ are approximately uncorrelated. A more general result,
mentioned in Section III, is that the correlation between the
input and the quantization error is approximately equal to
the MSE of the quantizer when the codevectors are approximately
centroids.

B. Performance of the Best k-Dimensional, Fixed-Rate
Quantizers

Having Bennett's integral for distortion, one can hope
to find a formula for $\delta_k(R)$, the operational distortion-rate
function for $k$-dimensional, fixed-rate vector quantization,
by choosing the key characteristics, point density and inertial
profile, to minimize (27). Unfortunately, it is not
known how to find the best inertial profile. Indeed, it
is not even known what functions are allowable as inertial
profiles. However, Gersho (1979) [193] made the now
widely accepted conjecture that when rate is large, most
cells of a $k$-dimensional quantizer with rate $R$ and minimum
or nearly minimum MSE are approximately congruent
to some basic tessellating⁴ $k$-dimensional cell shape $T_k$.
In this case, the optimum inertial profile is a constant and
Bennett's integral can be minimized by variational techniques
or Hölder's inequality [222], [193], resulting in the
optimal point density

$$\lambda_k^*(x) = \frac{f^{k/(k+2)}(x)}{\int f^{k/(k+2)}(x')\,dx'} \tag{29}$$

and the following approximation to the operational
distortion-rate function: for large $R$

$$\delta_k(R) \cong M_k \beta_k \sigma^2 2^{-2R} = Z_k(R), \tag{30}$$

where $M_k = M(T_k)$, which is the least normalized moment
of inertia of $k$-dimensional tessellating polytopes, and

$$\beta_k = \frac{1}{\sigma^2} \left( \int f^{k/(k+2)}(x)\,dx \right)^{(k+2)/k}$$

is the term depending on the source distribution. Dividing
by variance makes $\beta_k$ invariant to a scaling of the
source. We will refer to $M_k$, $\beta_k$ and $Z_k(R)$ as, respectively,
Gersho's constant (in dimension $k$), Zador's factor
(for $k$-dimensional, fixed-rate quantization) and the Zador-Gersho
function (for $k$-dimensional, fixed-rate quantization).
(Zador's role will be described later.) When $k = 1$,
$Z_1(R)$ reduces to the Panter-Dite formula (8).
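
As a numerical illustration of Zador's factor in the scalar case, the sketch below evaluates $\beta_1 = \sigma^{-2} (\int f^{1/3})^3$ for a unit-variance Gaussian density by a midpoint rule and compares it with the closed form $6\sqrt{3}\pi$ quoted later for the Gaussian source. The function names, integration range and step count are choices made here for illustration only.

```python
import math

def gaussian_pdf(x, sigma=1.0):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def zador_factor_scalar(pdf, variance, lo, hi, steps=20000):
    """beta_1 = (1/variance) * (integral of pdf^(1/3))^3, midpoint rule."""
    h = (hi - lo) / steps
    integral = h * sum(pdf(lo + (i + 0.5) * h) ** (1.0 / 3.0) for i in range(steps))
    return integral ** 3 / variance

def panter_dite_distortion(beta_1, variance, rate):
    """Z_1(R) = M_1 * beta_1 * sigma^2 * 2^(-2R) with M_1 = 1/12, as in (30)."""
    return beta_1 * variance * 2.0 ** (-2.0 * rate) / 12.0

beta_1 = zador_factor_scalar(gaussian_pdf, 1.0, -20.0, 20.0)
print(beta_1, 6.0 * math.sqrt(3.0) * math.pi)
print(panter_dite_distortion(beta_1, 1.0, rate=4.0))
```

The truncation to $[-20, 20]$ is harmless here because $f^{1/3}$ decays like a Gaussian with standard deviation $\sqrt{3}$.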

From the form of $\lambda_k^*(x)$ one may straightforwardly deduce
that cells are smaller and have higher probability
where $f(x)$ is larger, and that all cells contribute roughly
the same to the distortion; i.e. $D_i(q)$ in (26) is approximately
the same for all $i$, which is the "partial distortion
theorem" first deduced for scalar quantization by Panter
and Dite.

A number of properties of $M_k$ and $\beta_k$ are known; here,
we mention just a few. Gersho's constant $M_k$ is known only
for $k = 1$ and 2, where $T_k$ is, respectively, an interval and
a regular hexagon. It is not known whether the $M_k$'s are
monotonically nonincreasing for all $k$, but it can be shown
that they form a subadditive sequence, which is a property
strong enough to imply that the infimum over $k$ equals the
limit as $k$ tends to infinity. Though it has long been presumed,
only recently has it been directly shown that the
$M_k$'s tend to $\frac{1}{2\pi e}$ as $k$ increases (Zamir and Feder [564]),
which is the limit of the normalized moment of inertia of
$k$-dimensional spheres as $k$ tends to infinity. Previously,
the assertion that the $M_k$'s tend to $\frac{1}{2\pi e}$ depended on Gersho's
conjecture. Zador's factor $\beta_k$ tends to be smaller
for source densities that are more "compact" (lighter tails
and more uniform) and have more dependence among the
source variables.

4 A cell $T$ "tessellates" if there exists a partition of $\Re^k$ whose cells
are, entirely, translations and rotations of $T$. The Voronoi cell of any
lattice tessellates, but not all tessellations are generated by lattices.
Gersho also conjectured that $T_k$ would be admissible in the sense
that the Voronoi partition for the centroids of the tessellation would
coincide with the tessellation. But this is not essential.

Fortunately, high resolution theory need not rely solely
on Gersho's conjecture, because Zador's dissertation [561]
and subsequent memo [562] showed that for large rate $\delta_k(R)$
has the form $b_k \beta_k \sigma^2 2^{-2R}$, where $b_k$ is independent of the
source distribution. Thus Gersho's conjecture is really just
a conjecture about $b_k$.

In deriving the key result, Zador first showed that for
a random vector that is uniformly distributed on the unit
cube, $\delta_k(R)$ has the form $b_k 2^{-2R}$ when $R$ is large, which
effectively defines $b_k$. (In this case, $\beta_k \sigma^2 = 1$.) He then
used this to prove the general result by showing that no
quantizer with high rate could do better than one whose
partition is hierarchically constructed by partitioning $\Re^k$
into small equally sized cubes and then subdividing each
with the partition of the quantizer that is best for a uniform
distribution on that cube, where the number of cells within
each cube depends on the source density in that cube. In
other words, the local structure of an asymptotically optimal
quantizer can be that of the optimum quantizer for a
uniform distribution.

In this light, Gersho's conjecture is true if and only if at
high rates one may obtain an asymptotically optimal quantizer
for a uniform distribution by tessellating with $T_k$. The
latter statement has been proven for $k = 1$ (cf. [106], p. 59)
and for $k = 2$ by Fejes Toth (1959) [159]; see also [385]. For
$k = 3$, it is known that the best lattice tessellation is the
body-centered cubic lattice, which is generated by a truncated
octahedron [35]. It has not been proven that this is
the best tessellation, though one would suspect that it is.
In summary, Gersho's conjecture is known to be true only
for $k = 1$ and 2. Might it be false for $k \ge 3$? If it is,
it might be that the best quantizers for a uniform source
have a periodic tessellation in which two or more cell shapes
alternate in a periodic fashion, like the hexagons and pentagons
on the surface of a soccer ball. If the cells in one
period of the tessellation have the same volumes, then one
may apply Bennett's integral, and (30) holds with $M_k$ replaced
by the average of the normalized moment of inertia
of the cells in one period. However, if the cells have
unequal volumes, then as in the example given while discussing
Condition (iv) of Bennett's integral, the MSE will
be the average of distortions computed by using Bennett's
integral separately on the union of cells of each type, and a
macrolevel definition of $M_k$ will be needed. It might also be
that the structure of optimal quantizers is aperiodic. However,
it seems likely to us that, asymptotically, one could
always find a quantizer with a periodic structure that is
essentially as good as any aperiodic one.

It is an open question in dimensions three and above
whether the best tessellation is a lattice. In most dimensions
the best known tessellation is a lattice. However,
tessellations that are better than the best known lattices
have recently been found for dimensions seven and nine by
Agrell and Eriksson [149].

From now on, we shall proceed assuming Gersho's conjecture
is correct, with the knowledge that if this is not the
case, then analyses based on $M_k$ will be wrong (for $k \ge 3$)
by the factor $M_k/b_k$, which will be larger than 1 (but probably
not much larger), and which in any case will converge
to one as $k \to \infty$, as discussed later.

C. Performance of the Best k-Dimensional, Variable-Rate
Quantizers

Extensions of high resolution theory to variable-rate
quantization can also be based on Bennett's integral,
as well as approximations, originally due to Gish and
Pierce [204], to the entropy of the output of a quantizer.
Two such approximations, which can be derived
using approximations much like those used to derive Bennett's
integral, were stated earlier for scalar quantizers in
(11) and (13). However, the approximation (13), which
says that for quantizers with mostly small cells $H(q) \cong
h(X) + E[\log \Lambda(X)]$, where $\Lambda(x)$ is the unnormalized point
density, holds equally well for vector quantizers, when $X$
is interpreted as a vector rather than a scalar variable. As
mentioned before, unnormalized point density is used because
with variable-rate quantization, the number of codevectors
is not a primary characteristic and may even be
infinite. For example, one can always add levels in a way
that has negligible impact on the distortion and entropy.
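
The entropy approximation can be illustrated in the simplest setting. The sketch below, which is an assumption-laden example rather than anything from the cited works, takes a unit-variance Gaussian input and a uniform scalar quantizer with step `step`, so that the unnormalized point density is the constant $\Lambda = 1/\text{step}$. It computes the output entropy from exact cell probabilities (truncated to 12 standard deviations) and compares it with $h(X) + \log_2 \Lambda$.

```python
import math

def gaussian_cdf(x, sigma=1.0):
    """Cumulative distribution function of a zero-mean Gaussian."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def uniform_quantizer_entropy(step, sigma=1.0, span=12.0):
    """Output entropy in bits of a uniform scalar quantizer with cell
    width `step`, using cells that cover [-span*sigma, span*sigma]."""
    n_cells = int(round(2.0 * span * sigma / step))
    entropy = 0.0
    for i in range(n_cells):
        lo = -span * sigma + i * step
        p = gaussian_cdf(lo + step, sigma) - gaussian_cdf(lo, sigma)
        if p > 0.0:  # far-tail cells may underflow to zero probability
            entropy -= p * math.log2(p)
    return entropy

def high_resolution_entropy(step, sigma=1.0):
    """h(X) + E[log2 Lambda(X)] with the constant density Lambda = 1/step."""
    h_bits = 0.5 * math.log2(2.0 * math.pi * math.e * sigma ** 2)
    return h_bits + math.log2(1.0 / step)

step = 0.1  # hypothetical cell width, small relative to sigma = 1
print(uniform_quantizer_entropy(step), high_resolution_entropy(step))
```

For a step this small the two values agree to within a fraction of a millibit per sample, and the agreement degrades as the step grows toward the standard deviation.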

We could now proceed to use Bennett's integral and the
entropy approximation to find the operational distortion-rate
function for variable-rate, $k$-dimensional, memoryless
VQ. However, we wish to consider a somewhat more general
case. Just as Gish and Pierce found something quite
interesting by examining the best possible performance of
scalar quantization with block entropy coding, we will now
consider the operational distortion-rate function for vector
quantization with block entropy coding. Specifically, we
seek $\delta_{k,L}(R)$, which is defined to be the infimum of the distortions
of any quantizer with rate $R$ or less, whose lossy
encoder is $k$-dimensional and memoryless, and whose lossless
encoder simultaneously codes a block of $L$ successive
quantization indices with a variable-length prefix code. In
effect, the overall code is a $kL$-dimensional, memoryless
VQ. However, we will refer to it as a $k$-dimensional (memoryless)
quantizer with $L$-th order variable-length coding
(or $L$-th order entropy coding). When $L = 1$, the code
becomes a conventional memoryless, variable-rate vector
quantizer. It is convenient to let $L = 0$ connote fixed-length
coding, so that $\delta_{k,0}(R)$ means the same as $\delta_k(R)$ of
the previous section. By finding high resolution approximations
to $\delta_{k,L}(R)$ for all values of $k \ge 1$ and $L \ge 0$, we
will be able to compare the advantages of increasing the dimension
$k$ of the quantizer to those of increasing the order
$L$ of the entropy coder.

To find $\delta_{k,L}(R)$ we assume that the source produces a
sequence $(\mathbf{X}_1, \ldots, \mathbf{X}_L)$ of identical, but not necessarily independent,
$k$-dimensional random vectors, each with density
$f(x)$. A straightforward generalization of (13) shows
that under high resolution conditions, the rate is given by

$$R = \frac{1}{kL}\, h(X_1, \ldots, X_{kL}) + \frac{1}{k} \int f(x) \log \Lambda(x)\,dx. \tag{31}$$

On the other hand, the distortion of such a code may be
approximated using Bennett's integral (27), with $\Lambda(x)/N$
substituted for the normalized point density $\lambda(x)$. Then, as
with fixed-rate vector quantization, one would like to find
$\delta_{k,L}(R)$ by choosing the inertial profile $m$ and the point
density $\Lambda$ to minimize Bennett's integral subject to a constraint
on the rate that the right-hand side of (31) be at
most $R$.

Once again, though it is not known how to find the best
inertial profile, Gersho's conjecture suggests that when rate
is large, the cells of the best rate-constrained quantizers
are, mostly, congruent to $T_k$. Hence, from now on we shall
assume that the inertial profile of the best variable-rate
quantizers is, approximately, $m(x) = M_k$. In this case,
using variational techniques or simply Jensen's inequality,
one can show that the best point density is uniform on all
of $\Re^k$ (or at least over the support of the source density).
In other words, all quantizer cells have the same size, as
in a tessellation. Using this fact along with (27) and (31)
yields

$$\delta_{k,L}(R) \cong M_k \gamma_{kL} \sigma^2 2^{-2R} = Z_{k,L}(R), \tag{32}$$

where

$$\gamma_k = \frac{1}{\sigma^2}\, 2^{2 h_k}, \qquad h_k = \frac{1}{k}\, h(X_1, \ldots, X_k),$$

is the term depending on the source distribution. Dividing
by variance makes it invariant to scale. We call $\gamma_k$ the ($k$-th
order) Zador entropy factor and $Z_{k,L}(R)$ a Zador-Gersho
function for variable-rate coding. Since fixed-rate coding
is a special case of variable-length coding, it must be that
$\gamma_k$ is less than or equal to $\beta_k$ in (30). This can be directly
verified using Jensen's inequality [193].
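
The ordering of the two factors can be checked numerically in the scalar case. The sketch below is illustrative only: it assumes a unit-variance Laplacian density (chosen here as an example, not taken from the text), evaluates $\gamma_1 = \sigma^{-2} 2^{2h(X)}$ and $\beta_1 = \sigma^{-2} (\int f^{1/3})^3$ by a midpoint rule, and compares them with the closed forms $2e^2$ and $54$ that follow from direct integration of this density.

```python
import math

def integrate(g, lo, hi, steps):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

B = 1.0 / math.sqrt(2.0)  # Laplacian scale giving variance 2*B^2 = 1

def laplacian_pdf(x):
    """Unit-variance Laplacian density."""
    return math.exp(-abs(x) / B) / (2.0 * B)

def zador_factor(pdf, variance, lo, hi, steps):
    """beta_1 for a scalar density."""
    return integrate(lambda x: pdf(x) ** (1.0 / 3.0), lo, hi, steps) ** 3 / variance

def zador_entropy_factor(pdf, variance, lo, hi, steps):
    """gamma_1 = 2^(2h) / variance, with h the differential entropy in bits."""
    h_bits = integrate(lambda x: -pdf(x) * math.log2(pdf(x)), lo, hi, steps)
    return 2.0 ** (2.0 * h_bits) / variance

# An even number of steps on a symmetric range keeps the kink at 0 on a cell edge.
beta_1 = zador_factor(laplacian_pdf, 1.0, -60.0, 60.0, 120000)
gamma_1 = zador_entropy_factor(laplacian_pdf, 1.0, -60.0, 60.0, 120000)
print(gamma_1, beta_1)
```

For this density the gap between the two factors is considerably larger than for the Gaussian source, which is consistent with the remark that heavier tails increase Zador's factor.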

In the case of scalar quantization ($k = 1$), the optimality
of the uniform point density and the operational
distortion-rate function $\delta_{1,L}(R)$ were found by Gish and
Pierce (1968) [204]. Zador (1966) [562] considered the $L = 1$
case and showed that $\delta_{k,1}(R)$ has the form $c_k \gamma_k \sigma^2 2^{-2R}$
when $R$ is large, where $c_k$ is a constant that is independent
of the source density and no larger than the constant
$b_k$ that he found for fixed-rate quantization. Gersho [193]
used the argument given above to find the form of $\delta_{k,L}(R)$
given in (32).

As with fixed-rate quantization, we shall proceed under
the assumption that Gersho's conjecture is correct, in
which case $c_k = b_k = M_k$. If it is wrong, then our analyses
will be off by the factor $M_k/c_k$, which, as before, will
probably be just a little larger than one, and which in any
case will converge to one as $k \to \infty$.

D. Fixed-Rate Quantization with Arbitrary Dimension

We now restrict attention to the random process domain
wherein the source is assumed to be a one-dimensional,
scalar-valued, stationary random process. We seek a high
resolution approximation to the operational distortion-rate
function $\delta(R) = \inf_k \delta_k(R)$, which represents the best possible
performance of any fixed-rate (memoryless) quantizer.
As mentioned in Section III, for stationary sources
$\delta(R) = \lim_{k \to \infty} \delta_k(R)$. Therefore, taking the limit of the
high resolution approximation (30) for $\delta_k(R)$ yields the fact
that for large $R$

$$\delta(R) \cong M \beta \sigma^2 2^{-2R} = Z(R), \tag{33}$$

where

$$M = \lim_{k \to \infty} M_k = \frac{1}{2\pi e}, \qquad \beta = \lim_{k \to \infty} \beta_k,$$

and

$$Z(R) = \lim_{k \to \infty} Z_k(R)$$

is another Zador-Gersho function. This operational
distortion-rate function was also derived by Zador [561],
who showed that his unknown factors $b_k$ and $c_k$ converged
to $\frac{1}{2\pi e}$. The derivation given here is due to Gersho [193].
Notice that in this limiting case, there is no doubt about
the constant $M$.

As previously mentioned, the $M_k$'s are subadditive, so
that they are smallest when $k$ is large. Similarly, for stationary
sources it can be shown that the sequence $\{\log \beta_k\}$
is also subadditive [193], so that they too are smallest when
$k$ is large. Therefore another expression for the above
Zador-Gersho function is $Z(R) = \inf_k Z_k(R)$.

E. The Benefits of Increasing Dimension in Fixed-Rate 
Quantization 

Continuing in the random process domain (stationary
sources), the generally decreasing natures of $M_k$ and $\beta_k$
directly quantify the benefits of increasing dimension in
fixed-rate quantization. (Of course, there is also a cost to
increasing dimension, namely, the increase in complexity.)
For example, $M_k$ decreases from $\frac{1}{12} = .0833$ for $k = 1$
to the limit $\frac{1}{2\pi e} = .0586$. In decibels, this represents a
1.53 dB decrease in MSE. For an i.i.d. Gaussian source,
$\beta_k$ decreases from $6\sqrt{3}\pi = 32.6$ for $k = 1$ to the limit
$2\pi e = 17.1$, which represents an additional 2.81 dB gain.
In total, high dimensional quantization gains 4.35 dB over
scalar quantization for the i.i.d. Gaussian source. For a
Gauss-Markov source with correlation coefficient $\rho = .9$,
$\beta_k$ decreases from $6\sqrt{3}\pi = 32.6$ for $k = 1$ to the limit
$2\pi e(1 - \rho^2) = 3.25$, or a gain of 10.0 dB, yielding a total high
dimensional VQ gain of 11.5 dB over scalar quantization.
Because of the 6 dB per bit rule, any gain stated in dB
can be translated to a reduction in rate (bits/sample) by
dividing by 6.02.
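
The decibel figures above follow directly from the stated constants. The short sketch below recomputes them; the variable names are chosen here for illustration, and the constants are those quoted in the text.

```python
import math

def to_db(ratio):
    """Express a distortion ratio in decibels."""
    return 10.0 * math.log10(ratio)

two_pi_e = 2.0 * math.pi * math.e

# Gersho's constant: M_1 = 1/12 versus the limit 1/(2*pi*e).
space_filling_gain_db = to_db((1.0 / 12.0) / (1.0 / two_pi_e))

# Zador's factor for the i.i.d. Gaussian source: 6*sqrt(3)*pi versus 2*pi*e.
beta_1 = 6.0 * math.sqrt(3.0) * math.pi
iid_beta_gain_db = to_db(beta_1 / two_pi_e)
iid_total_db = space_filling_gain_db + iid_beta_gain_db

# Gauss-Markov source with correlation coefficient rho = 0.9.
rho = 0.9
gm_beta_gain_db = to_db(beta_1 / (two_pi_e * (1.0 - rho ** 2)))
gm_total_db = space_filling_gain_db + gm_beta_gain_db

# The 6 dB per bit rule converts a gain in dB to a rate saving in bits/sample.
iid_rate_saving = iid_total_db / 6.02

print(round(space_filling_gain_db, 2), round(iid_beta_gain_db, 2), round(iid_total_db, 2))
print(round(gm_beta_gain_db, 2), round(gm_total_db, 2), round(iid_rate_saving, 2))
```

Small differences in the last digit relative to the quoted totals come from rounding the individual terms before adding them.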

On the other hand, it is also important to understand
what specific characteristics of vector quantizers improve
with dimension and by how much. Motivated by several
prior explanations [342], [333], [365], we offer the following.
We wish to compare an optimal quantizer, $q_k$, with
dimension $k$ to an optimal $k'$-dimensional quantizer, $q_{k'}$,
with $k' \gg k$. To simplify the discussion, assume $k'$ is a
multiple of $k$. Though these two quantizers have differing
dimensions, their characteristics can be fairly compared
by comparing $q_{k'}$ to the "product" VQ $q_{pr,k'}$ that is implicitly
formed when $q_k$ is used $k'/k$ times in succession.
Specifically, the product quantizer has quantization rule

$$q_{pr,k'}(x) = \left( q_k(x_1), \ldots, q_k(x_{k'/k}) \right),$$

where $x_1, \ldots, x_{k'/k}$ are the successive $k$-tuples of $x$, and
reproduction codebook $C_{pr,k'}$ consisting of the concatenations
of all possible sequences of $k'/k$ codevectors from $q_k$'s
reproduction codebook $C_k$. The subscripts "$k$" and "$pr,k'$"
will be attached as needed to associate the appropriate features
with the appropriate quantizer. The distortion and
rate of the product quantizer are easily seen to be those
of the $k$-dimensional VQ. Thus the shortcomings of an optimal
$k$-dimensional quantizer relative to an optimal high
dimensional quantizer may be identified with those of the
product quantizer, in particular with the latter's suboptimal
point density and inertial profile, which we now
find.

To simplify discussion, assume for now that $k = 1$, and
let $q_1$ be a fixed-rate scalar quantizer, with large rate, levels
in the middle of the cells, and point density $\lambda_{sq}(x_1)$.
The cells of the product quantizer $q_{pr,k'}$ are $k'$-dimensional
rectangles formed by Cartesian products of cells from the
scalar quantizer. When the scalar cells have the same
width, a $k'$-dimensional cube is formed; otherwise a rectangle
is formed, i.e. an "oblong" cube. Since the widths
of the cells are, approximately, determined by $\lambda_{sq}(x_1)$, the
point density and inertial profile of $q_{pr,k'}$ are determined
by $\lambda_{sq}$. Specifically, from the rectangular nature of the
product cells one obtains [365], [378]

$$\lambda_{pr,k'}(x) = \prod_{i=1}^{k'} \lambda_{sq}(x_i) \tag{34}$$

and

$$m_{pr,k'}(x) = \frac{1}{12}\, \frac{\frac{1}{k'} \sum_{i=1}^{k'} \frac{1}{\lambda_{sq}^2(x_i)}}{\left( \prod_{i=1}^{k'} \frac{1}{\lambda_{sq}^2(x_i)} \right)^{1/k'}}, \tag{35}$$

which derive, respectively, from the facts that the volume
of a rectangle is the product of its side lengths, that the
normalized moment of inertia of a rectangle is that of a
cube (1/12) times the ratio of the arithmetic mean of the
squares of the side lengths to the geometric mean of those squares, and that
the side lengths are determined by the scalar point density.
Note that along the diagonal of the first "quadrant"
(where $x_1 = x_2 = \ldots = x_{k'}$), the product cells are cubes
and $m_{pr,k'}(x) = 1/12$, the minimum value. Off the diagonal,
the cells are usually rectangular and, consequently,
$m_{pr,k'}(x)$ is larger.
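
The rectangle formula underlying (35) can be checked against the definition of the normalized moment of inertia. The sketch below is illustrative: the function names and the example side lengths are assumptions made here, and the level is placed at the center of the rectangle. The closed form uses the ratio of the arithmetic to the geometric mean of the squared side lengths; the second function evaluates the defining integral axis by axis with a midpoint rule.

```python
import math

def rectangle_nmi(sides):
    """Normalized moment of inertia of a rectangle about its center:
    (1/12) * (arithmetic mean of squared sides) / (geometric mean of squared sides)."""
    k = len(sides)
    mean_sq = sum(a * a for a in sides) / k
    geo_mean_sq = math.prod(a * a for a in sides) ** (1.0 / k)
    return mean_sq / (12.0 * geo_mean_sq)

def rectangle_nmi_by_integration(sides, n=1000):
    """Same quantity from the definition (1/k) * int ||x||^2 dx / vol^(1+2/k).
    The integral separates by axis, so only 1-D midpoint sums are needed."""
    k = len(sides)
    volume = math.prod(sides)
    second_moment = 0.0
    for a in sides:
        h = a / n
        axis_integral = h * sum((-a / 2.0 + (j + 0.5) * h) ** 2 for j in range(n))
        second_moment += axis_integral * volume / a  # other axes contribute their lengths
    return second_moment / (k * volume ** (1.0 + 2.0 / k))

print(rectangle_nmi([1.0, 1.0, 1.0]))   # a cube
print(rectangle_nmi([1.0, 2.0]), rectangle_nmi_by_integration([1.0, 2.0]))
```

In the notation of (35), the side length of the product cell along coordinate $i$ is approximately proportional to $1/\lambda_{sq}(x_i)$, so unequal scalar point densities along the coordinates translate into oblong cells and a larger value.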

To quantify the suboptimality of the product quantizer's
principal features, we factor the ratio of the distortions of
$q_{pr,k'}$ and $q_{k'}$, which is a kind of loss, into terms that
reflect the loss due to the inertial profile and point density
[365], [379]⁵:

5 Na and Neuhoff considered the ratio of the product code distortion 
$$L = \frac{D(q_{pr,k'})}{\delta_{k'}(R)} = \frac{B(k', m_{pr,k'}, \lambda_{pr,k'}, f)}{B(k', M_{k'}, \lambda_{k'}^*, f)}$$

$$= \frac{B(k', m_{pr,k'}, \lambda_{pr,k'}, f)}{B(k', M_{k'}, \lambda_{pr,k'}, f)} \times \frac{B(k', M_{k'}, \lambda_{pr,k'}, f)}{B(k', M_{k'}, \lambda_{k'}^*, f)} = L_{ce} \times L_{pt}, \tag{36}$$



where

$$B(k, m, \lambda, f) = \int \frac{m(x)}{\lambda^{2/k}(x)}\, f(x)\,dx$$

is the part of Bennett's integral that does not depend on
$N$, where the cell shape loss, $L_{ce}$, is the ratio of the distortion
of the product quantizer to that of a hypothetical
quantizer with the same point density and an optimal inertial
profile, and where the point density loss, $L_{pt}$, is the ratio
of the distortion of a hypothetical quantizer with the point
density of the product quantizer and a constant (e.g. optimal)
inertial profile to that of a hypothetical quantizer
with an optimal point density and the same (constant) inertial
profile. Substituting (35) into (36) and using the fact
that for large $k'$, $M_{k'} \cong \frac{1}{2\pi e}$, one finds



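$$L = \frac{\pi e}{6} \times \frac{\int \frac{1}{k'} \sum_{i=1}^{k'} \frac{1}{\lambda_{sq}^2(x_i)}\, f_{k'}(x)\,dx}{\int \left( \prod_{i=1}^{k'} \frac{1}{\lambda_{sq}^2(x_i)} \right)^{1/k'} f_{k'}(x)\,dx} \times \frac{\int \left( \prod_{i=1}^{k'} \frac{1}{\lambda_{sq}^2(x_i)} \right)^{1/k'} f_{k'}(x)\,dx}{\int \frac{1}{\lambda_{k'}^{*\,2/k'}(x)}\, f_{k'}(x)\,dx}$$

$$= L_{sp} \times L_{ob} \times L_{pt}, \tag{37}$$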



where the cell shape loss has been factored into the product
of a space filling loss [333]⁶, $L_{sp}$, which is the ratio of the
normalized moment of inertia of a cube to that of a high
dimensional sphere, and an oblongitis loss, $L_{ob}$, which is
the factor by which the rectangularity of the cells makes
the cell shape loss larger than the space filling loss.

To proceed further, consider first an i.i.d. source (stationary
and memoryless) and consider how to choose the
scalar point density $\lambda_{sq}(x_1)$ in order to minimize $L$. On the
one hand, choosing $\lambda_{sq}(x_1)$ to be uniform on the set where
the one-dimensional density⁷ $f_1(x_1)$ is not small causes the
product cells in the region where the $k'$-dimensional density
$f_{k'}(x)$ is not small to be cubes and, consequently, makes
$L_{ob} = 1$, which is the smallest possible value. However, it
causes the product point density to be poorly matched to
the source density and, as a result, $L_{pt}$ is large. On the
other hand, choosing $\lambda_{sq}(x_1) = f_1(x_1)$ causes the product

to that of an optimal $k$-dimensional VQ for arbitrary $k$, not just for
large $k$.

6 Actually, Lookabaugh and Gray defined the inverse as a vector
quantizer advantage. The space-filling loss was called a cubic loss
in [365].

7 Dimension will be added as a subscript to $f$ in places where the
dimension of $X$ needs to be emphasized.



quantizer to have, approximately, the optimal point density⁸:
$\lambda_{pr,k'}(x) = \prod_{i=1}^{k'} f_1(x_i) = f_{k'}(x) \cong \lambda_{k'}^*(x)$, where
the last step uses the fact that $k'$ is large. However, this
choice causes $L_{ob}$ to be infinite⁹. The best point density,
as implicitly found by Panter and Dite, is the compromise

$$\lambda_1^*(x_1) = \frac{f_1^{1/3}(x_1)}{\int f_1^{1/3}(x)\,dx}$$

as given in (29). In the region where $f_1(x_1)$ is not small,
$\lambda_1^*(x_1)$ is "more uniform" than the choice $\lambda_1(x_1) = f_1(x_1)$ that causes
the product quantizer to have the optimum point density.
Therefore, it generates a product quantizer whose cells in
the region where $f_{k'}(x)$ is largest are more cubic, which
explains why it has less oblongitis loss.

As an example, for an i.i.d. Gaussian source, the optimal
choice of scalar quantizer causes the product quantizer to
have 0.94 dB oblongitis loss and 1.88 dB point density loss.
The sum of these, 2.81 dB, which equals $10 \log_{10} \beta_1/\beta$, has
been called the "shape loss" [333] because it is determined
by the shape of the density: the more uniform the density,
the less need for compromise, because the scalar point
densities leading to best product cell shapes and best point
density are more similar. Indeed, for a uniform source density,
there is no shape loss. In summary, for an i.i.d. source,
in comparison to high dimensional quantization, the shortcomings
of scalar quantization with fixed-rate coding are:
(1) the $L_{sp} = 1.53$ dB space filling loss and (2) the lack of
sufficient degrees of freedom to simultaneously attain good
inertial profile (small $L_{ob}$) and good point density (small
$L_{pt}$). On the other hand, it is often surprising to newcomers
that vector quantization gains anything at all over
scalar quantizers for i.i.d. sources, and secondly, that the
gain is more than just the recovery of the space filling loss.

A similar comparison can be made between $k$-dimensional
($k \ge 2$) and high dimensional VQ, by comparing
the product quantizer formed by $k'/k$ uses of a $k$-dimensional
VQ to an optimal $k'$-dimensional quantizer,
for large $k'$. The results are that as $k$ increases: (1) the
space filling loss $L_{sp} = M_k \big/ \frac{1}{2\pi e}$ decreases, and (2) there
are more degrees of freedom so that less compromise is
needed between the $k$-dimensional point density that minimizes
oblongitis and the one that gives the optimal point
density. As a result the oblongitis, point density and shape
losses decrease to zero, along with the space filling loss.
For the i.i.d. Gaussian source, these losses are plotted in
Figure 5.

For sources with memory, scalar quantization ($k = 1$)
engenders an additional loss due to its inability to exploit
the dependence between source samples. Specifically, when
there is dependence/correlation between source samples,
the product point density cannot match the ideal point
density, not even approximately. See [333], [365] for a definition
of memory loss. (One can factor both the point density
and oblongitis losses into two terms, one of which is

8 The fact that product quantizers can have the optimal point density
is often overlooked.
9 This implies that distortion will not decrease as $2^{-2R}$.
Fig. 5. Losses of optimal $k$-dimensional quantization relative to optimal
high-dimensional quantization for an i.i.d. Gaussian source.
The bottom curve is point density loss; above that is point density
loss plus oblongitis loss; and the top curve is the total loss.
For $k \ge 4$, the space filling losses are estimates.

due to the quantizer's inability to exploit memory.) There
is also a memory loss for $k$-dimensional quantization, which
decreases to 1 as $k$ increases. The value of $k$ for which
the memory loss becomes close to unity (i.e. negligible)
can be viewed as a kind of "effective memory or correlation
length" of the source. It is closely related to the decorrelation/independence
length of the process, i.e. the smallest
value of $k$ such that source samples are approximately uncorrelated
when separated by more than $k$.

F. Variable-Rate Quantization with Arbitrary Quantizer 
Dimension and Entropy Coding Order 

We continue in the random process domain (stationary sources).
To find the best possible performance of vector quantizers with
block entropy coding over all possible choices of the dimension
k of the lossy encoder and the order L of the entropy coder, we
examine the high resolution approximation (32), which shows that
$\delta_{k,L}(R) \cong M_k \gamma_{kL} \sigma^2 2^{-2R}$. As
mentioned previously, the $M_k$'s are subadditive, so choosing k
large makes $M_k$ as small as possible, namely as small as $M$.
Next, for stationary sources, it is well known that the k-th
order differential entropy $h_k = \frac{1}{k}h(X_1,\ldots,X_k)$
is monotonically nonincreasing in k. Therefore, choosing either
k or L large makes $\gamma_{kL} = \sigma^{-2}2^{2h_{kL}}$ as
small as possible, namely as small as
$\gamma = \lim_{k\to\infty}\gamma_k$. Interestingly,
$\gamma = \beta = \lim_{k\to\infty}\beta_k$, as shown by Gersho
[193], who credits Thomas Liggett. It follows immediately that
the best possible performance of vector quantizers with block
entropy coding is given by $\delta(R) = M\beta\sigma^2 2^{-2R}$,
which is the operational distortion-rate function of fixed-rate
quantizers. In other words, entropy coding does not permit
performance better than high dimensional fixed-rate quantization.

Let us now reexamine the situation a bit more carefully. We may
summarize the various high resolution approximations to
operational distortion-rate functions as

$$\delta_{k,L}(R) \cong M_k\,\alpha_{k,L}\,\sigma^2 2^{-2R}, \qquad k \geq 1,\ L \geq 0, \tag{38}$$

where by convention L = 0 refers to fixed-rate coding, L ≥ 1
refers to Lth-order entropy coding and

$$\alpha_{k,L} = \begin{cases} \beta_k, & L = 0, \\ \gamma_{kL}, & L \geq 1. \end{cases}$$

Note that both the $M_k$'s and the $\alpha_{k,L}$'s tend to
decrease as k or L increase. (The $M_k$'s and the
$\log\beta_k$'s are subadditive. The $\gamma_k$'s are
nonincreasing.) As an illustration, Figure 6 plots
$10\log_{10}\alpha_{k,L}$ (in dB) vs. k and L for a Gauss-Markov
source with correlation coefficient ρ = 0.9.

Consider how $\delta_{k,L}(R)$ decreases, i.e. improves, with k
and L increasing. On the one hand for fixed k, it decreases with
increasing L (actually, it is monotonically nonincreasing) to

$$\delta_{k,\infty}(R) = M_k\beta\sigma^2 2^{-2R} = \frac{M_k}{M}\,\delta(R). \tag{39}$$

Thus, k-dimensional quantization with high order entropy coding
suffers only the k-dimensional space filling loss. On the other
hand for fixed L, $\delta_{k,L}(R)$ decreases with k (actually it
is subadditive) to

$$\delta_{\infty,L}(R) = M\beta\sigma^2 2^{-2R} = \delta(R). \tag{40}$$

Hence, high dimensional quantization suffers no loss relative to
the best possible performance, no matter the order or absence of
an entropy coder.

From the above, we see that to attain performance close to
$\delta(R)$, k must be large enough that the space filling loss
is approximately one, and the combination of k and L must be
large enough that $\alpha_{k,L}/\beta$ is also approximately one.

Regarding the first of these, even k = 1 (scalar quantization)
yields $M_1/M = \pi e/6 = 1.42$, representing only a 1.53 dB
loss, which may be acceptable in many situations. When it is not
acceptable, k needs to be increased. Unfortunately, as evident
in Figure 5, the space-filling loss decreases slowly with
increasing k. Regarding the second, we note that one has
considerable freedom. There are two extreme cases: (1) k large
and L = 0, i.e. fixed-rate, high dimensional quantization, or
(2) L large and k = 1, i.e. scalar quantization with high order
entropy coding. In fact, uniform scalar quantization will
suffice in the second case. Alternatively, one may choose
moderate values for both k and L. Roughly speaking, kL must be
approximately equal to the effective memory length of the source
plus the value needed for a memoryless source. In effect, if the
source has considerable memory, such memory can be exploited
either by the lossy encoder (k large) or the lossless encoder
(L large) or both (moderate values of k and L). Moreover, in
such cases the potential reductions in $\alpha_{k,L}$ due to
increasing k or L tend to be much larger than the potential
reductions in the space-filling loss. For example for the
Gauss-Markov source of Figure 6, $\alpha_{k,0} = \beta_k$
decreases 10.0 dB as k increases from one to infinity, and has
already decreased 8.1 dB when k = 6.
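The Gauss-Markov figures quoted above can be checked with a short
sketch. It assumes the closed form
$\beta_k = 2\pi\,((k+2)/k)^{(k+2)/2}\,(\det K_k)^{1/k}/\sigma^2$
for a k-dimensional Gaussian density with covariance $K_k$, which
we obtain by evaluating Zador's factor for a Gaussian density (it
is not stated in the text), together with
$\det K_k = \sigma^{2k}(1-\rho^2)^{k-1}$ for a Gauss-Markov source.
The function names are ours and purely illustrative.

```python
import math

def beta_k(k, rho):
    """Zador factor (normalized by the variance) for k consecutive samples
    of a Gauss-Markov source with correlation coefficient rho.
    Assumes the Gaussian closed form described in the lead-in."""
    # (det K_k)^(1/k) / sigma^2 = (1 - rho^2)^((k - 1) / k)
    det_root = (1.0 - rho * rho) ** ((k - 1) / k)
    return 2.0 * math.pi * ((k + 2) / k) ** ((k + 2) / 2) * det_root

def beta_limit(rho):
    """Limit of beta_k as k grows: 2 * pi * e * (1 - rho^2)."""
    return 2.0 * math.pi * math.e * (1.0 - rho * rho)

def db(x):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(x)

rho = 0.9
total_drop = db(beta_k(1, rho)) - db(beta_limit(rho))
drop_at_6 = db(beta_k(1, rho)) - db(beta_k(6, rho))
print(f"decrease from k = 1 to infinity: {total_drop:.2f} dB")
print(f"decrease from k = 1 to k = 6:    {drop_at_6:.2f} dB")
```

Under these assumptions the two printed values come out close to
the 10.0 dB and 8.1 dB quoted in the text, which illustrates how
most of the available gain is already realized at small dimension.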

From the point of view of the lossy encoder, the benefit of
entropy coding is that it reduces the dimension required of the
lossy encoder. Similarly, from the point of view of the lossless
encoder, the benefit of increasing the dimension of the vector
quantizer is that it decreases the order required of the lossless
encoder. Stated another way, the benefits of entropy coding
decrease with increasing quantizer dimension, and the benefits of
increasing quantizer dimension decrease with increasing entropy
coding order. In summary (cf. [377]), optimal performance is
attainable with and only with a high dimensional lossy encoder,
and with or without entropy coding. However, good performance
(within 1.53 dB of the best) is attainable with a uniform scalar
quantizer and high order entropy coding. Both of these extreme
approaches are quite complex, and so practical systems tend to be
compromises with moderate quantizer dimension and entropy coding
order.

Fig. 6. $10\log_{10}\alpha_{k,L}$ for a Gauss-Markov source with
correlation coefficient 0.9.

As with fixed-rate quantization, it is important to understand
what specific characteristics of variable-rate quantizers cause
them to perform the way they do. Consequently, we will take
another look at variable-rate quantization, this time from the
point of view of the point density and inertial profile of the
high dimensional product quantizer induced by an optimal low
dimensional variable-rate quantizer. The situation is simpler
than it was for fixed-rate quantization. As mentioned earlier,
when rate is large, an optimal k-dimensional variable-rate
quantizer has a uniform point density and a partition and
codebook formed by tessellating $T_k$. Suppose k is small and k'
is a large multiple of k. From the structure of optimal
variable-rate quantizers, one sees that using an optimal
k-dimensional quantizer k'/k times yields a k'-dimensional
quantizer having the same (uniform) point density as the optimal
k'-dimensional quantizer and differing, mainly, in that its
inertial profile equals the constant $M_k$, whereas that of the
optimal k'-dimensional quantizer equals $M_{k'} \approx M$. Thus,
the loss due to k-dimensional quantization is only the space
filling loss $M_k/M$, which explains what Gish and Pierce found
for scalar quantizers in 1968 [204]. We emphasize that there is
no point density, oblongitis or memory loss, even for sources
with memory. In effect, the entropy code has eliminated the need
to shape the point density, and as a result, there is no need to
compromise cell shapes.



Finally, let us compare the structure of the fixed-rate and
variable-rate approaches when dimension is large. On the one
hand, optimal quantizers of each type have the same constant
inertial profile, namely, $m(x) \cong M_k$. On the other hand,
they have markedly different point densities: an optimal
fixed-rate quantizer has point density
$\lambda_k^*(x) \cong f_k(x)$, whereas an optimal variable-rate
quantizer has point density that is uniform over all of $\Re^k$.
How is it that two such disparate point densities do in fact
yield the same distortion? The answer is provided by the
asymptotic equipartition property (AEP) [110], which is the key
fact upon which most of information theory rests. For a
stationary, ergodic source with continuous random variables, the
AEP says that when dimension k is large, the k-dimensional
probability density is approximately constant, except on a set
with small probability. More specifically it shows
$\Pr(X \in \mathcal{T}_k) \approx 1$, where

$$\mathcal{T}_k = \left\{ x \in \Re^k : -\frac{1}{k}\log f_k(x) \cong \bar{h} \right\}$$

is a set of typical sequences, where
$\bar{h} = \lim_{k\to\infty} h_k$ is the differential entropy
rate of the source. It follows immediately from the AEP and the
fact that $\lambda_k^*(x) \cong f_k(x)$ that the point density of
an optimal fixed-rate quantizer is approximately uniform on
$\mathcal{T}_k$ and zero elsewhere. Moreover, for an optimal
variable-rate quantizer, whose point density is uniform over all
of $\Re^k$, we see that the cells not in $\mathcal{T}_k$ can be
ignored, because they have negligible probability, and that the
cells in $\mathcal{T}_k$ all have the same probability and,
consequently, can be assigned codewords of equal length. Thus
both approaches lead to quantizers that are identical on
$\mathcal{T}_k$ (uniform point density and fixed-length
codewords) and differ only in what they do on the complement of
$\mathcal{T}_k$, a set of negligible probability.
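The concentration that the AEP describes can be observed in a
small simulation. The sketch below is an illustration under our
own assumptions: an i.i.d. unit-variance Gaussian source, natural
logarithms, and a tolerance `eps` standing in for the informal
"approximately equal" in the definition of the typical set. The
function names are hypothetical.

```python
import math
import random

def sample_entropy_density(k, rng):
    """Return -(1/k) ln f_k(X) for one draw of k i.i.d. N(0,1) samples."""
    energy = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
    return 0.5 * math.log(2.0 * math.pi) + energy / (2.0 * k)

def fraction_typical(k, eps, trials=500, seed=0):
    """Estimate the probability that -(1/k) ln f_k(X) lies within eps
    of the differential entropy rate (in nats) of the N(0,1) source."""
    rng = random.Random(seed)
    h = 0.5 * math.log(2.0 * math.pi * math.e)
    hits = sum(abs(sample_entropy_density(k, rng) - h) <= eps
               for _ in range(trials))
    return hits / trials

for k in (4, 32, 256, 1000):
    print(k, fraction_typical(k, eps=0.1))
```

As k grows the estimated probability of the typical set approaches
one, which is the sense in which the density is approximately
constant except on a set of small probability.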

It is worthwhile emphasizing that in all of the discussion in
this section we have restricted attention to quantizers with
memoryless lossy encoders and either fixed-rate, memoryless or
block lossless encoders. Though there are many lossy and lossless
encoders that are not of this form, such as DPCM or finite-state,
predictive or address vector VQ, and Lempel-Ziv or arithmetic
lossless coding, we believe that the easily analyzed case studied
here shows, representatively, the effects of increasing memory in
the lossy and lossless encoders.

G. Other Distortion Measures 

By far the most commonly assumed distortion measure is squared
error, which for scalars is defined by $d(x,y) = |x-y|^2$ and
for vectors is defined by

$$d(x,y) = \sum_{i=1}^{k} |x_i - y_i|^2,$$

where $x = (x_1,\ldots,x_k)$. Often the distortion is normalized
by 1/k. A variety of more general distortion measures have been
considered in the literature, but the simplicity and tractability
of squared error has long given it a central role. Intuitively,
the average squared error is the average energy or power in the
quantization noise. The most common extension of distortion
measures for scalars is the rth power distortion,
$d(x,y) = |x-y|^r$. For example, Roe [443] generalized Max's
formulation to distortion measures of this form. Gish and Pierce
[204] considered a more general distortion measure of the form
$d(x,y) = L(x-y)$, where L is a monotone increasing function of
the magnitude of its argument and L(0) = 0, with the added
requirement that the function

$$M(v) = \frac{1}{v}\int_{-v/2}^{v/2} L(u)\,du$$

is such that $vM'(v)$ is monotone. None of these distortion
measures has been widely used, although the magnitude error (rth
power with r = 1) has been used in some studies, primarily
because of its simple computation in comparison with the squared
error (no multiplications).
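As a worked instance of the Gish and Pierce condition (our own
illustration, not taken from [204]), consider the rth power
distortion $L(u) = |u|^r$ with r > 0. Direct integration gives:

```latex
M(v) = \frac{1}{v}\int_{-v/2}^{v/2} |u|^r \, du
     = \frac{2}{v}\cdot\frac{(v/2)^{r+1}}{r+1}
     = \frac{v^r}{2^r\,(r+1)},
\qquad
v\,M'(v) = \frac{r\,v^r}{2^r\,(r+1)} .
```

Hence $vM'(v)$ is monotone increasing for v > 0, so the condition
holds for every rth power distortion. For r = 2 this reduces to
$M(v) = v^2/12$, the familiar average squared error over a cell
of width v.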

The scalar distortion measures have various generalizations to
vectors. If the dimension is fixed, then one needs only a
distortion measure, say $d_k(x,y)$, defined for all
$x, y \in \Re^k$. If the dimension is allowed to vary, however,
then one requires a family of distortion measures $d_k(x,y)$,
k = 1, 2, ..., which collection is called a fidelity criterion in
source coding theory. Most commonly it is assumed that the
fidelity criterion is additive or single letter in the sense that

$$d_k((x_1,\ldots,x_k),(y_1,\ldots,y_k)) = d_l((x_1,\ldots,x_l),(y_1,\ldots,y_l)) + d_{k-l}((x_{l+1},\ldots,x_k),(y_{l+1},\ldots,y_k)), \tag{41}$$

for l = 1, 2, ..., k − 1, or, equivalently,

$$d_k((x_1,\ldots,x_k),(y_1,\ldots,y_k)) = \sum_{i=1}^{k} d_1(x_i,y_i). \tag{42}$$

Additive distortion measures are particularly useful for proving
source coding theorems since the normalized distortion will
converge under appropriate conditions as the dimension grows
large, thanks to the ergodic theorem. One can also assume more
generally that the distortion measure is subadditive in the sense
that

$$d_k((x_1,\ldots,x_k),(y_1,\ldots,y_k)) \leq d_l((x_1,\ldots,x_l),(y_1,\ldots,y_l)) + d_{k-l}((x_{l+1},\ldots,x_k),(y_{l+1},\ldots,y_k)), \tag{43}$$

and the subadditive ergodic theorem will still lead to positive
and negative coding theorems [340], [218].$^{10}$ An example of a
subadditive distortion measure is the Levenshtein distance [314]
which counts the number of insertions and deletions along with
the number of changes that it takes to convert one sequence into
another. Originally developed for studying error correcting
codes, the Levenshtein distance was rediscovered in the computer
science community as the "edit distance."

10 This differs slightly from the previous definition of
subadditive because the $d_k$ are not assumed to be normalized.
The previous definition applied to $d_k/k$ is equivalent to this
definition.
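The subadditivity (43) of the Levenshtein distance can be seen on
a small example. The following is a minimal sketch using the
standard dynamic programming recursion with unit costs for
insertions, deletions and changes; the function name and the test
strings are ours.

```python
def levenshtein(a, b):
    """Edit distance between sequences a and b with unit costs for
    insertion, deletion and substitution (row-by-row dynamic program)."""
    prev = list(range(len(b) + 1))          # distance from "" to b[:j]
    for i, ca in enumerate(a, start=1):
        cur = [i]                           # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,               # delete ca
                           cur[j - 1] + 1,            # insert cb
                           prev[j - 1] + (ca != cb))) # change ca to cb
        prev = cur
    return prev[-1]

x, y = "abcdef", "bcdefa"
whole = levenshtein(x, y)
parts = levenshtein(x[:3], y[:3]) + levenshtein(x[3:], y[3:])
print(whole, parts)  # the whole-sequence distance does not exceed the sum
```

Here a cyclic shift is cheap to repair on the whole sequence (one
deletion and one insertion) but costs as much again on each half,
so the inequality in (43) is strict for this pair.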

For a fixed dimension k one can observe that the squared-error
distortion measure can be written as $\|x-y\|^2$, where
$\|x-y\|$ is the $l_2$ norm

$$\|x-y\| = \left(\sum_{i=1}^{k} |x_i - y_i|^2\right)^{1/2}.$$

This idea can be extended by using any power of any $l_p$ norm,
e.g.,

$$d(x,y) = \|x-y\|_p^r,$$

where

$$\|x-y\|_p = \left(\sum_{i=1}^{k} |x_i - y_i|^p\right)^{1/p}.$$

(In this notation the $l_2$ norm is $\|\cdot\|_2$.) If we choose
p = r, then this distortion measure (sometimes referred to
simply as the rth power distortion) is additive. Zador [562]
defined a very general rth power distortion measure as any
distortion measure of the form $d(x,y) = \rho(x-y)$ where for
any a > 0, $\rho(ax) = a^r\rho(|x_1|,\ldots,|x_k|)$, for some
r > 0. This includes rth power distortion in the narrow sense
$\|x-y\|_2^r$, as well as the additive distortion measures of
the form $\|x-y\|_r^r = \sum_{i=1}^{k}|x_i-y_i|^r$, and even
weighted average distortions such as
$\sum_{i=1}^{k} w_i|x_i-y_i|^2$ and
$\sum_{i=1}^{k} w_i|x_i-y_i|^r$, where the $w_i$'s are
nonnegative.
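The claim that the rth power of the $l_p$ norm is additive
exactly when p = r is easy to check numerically. The sketch below
uses hypothetical helper names of our own; `split_gap` measures
how far a distortion is from satisfying (41) at split point l.

```python
def lp_power_distortion(x, y, p, r):
    """Distortion d(x, y) = ||x - y||_p ** r for equal-length sequences."""
    norm = sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)
    return norm ** r

def split_gap(x, y, l, p, r):
    """d(x, y) minus the sum of the distortions of the two pieces
    obtained by splitting after coordinate l; zero means additive."""
    whole = lp_power_distortion(x, y, p, r)
    head = lp_power_distortion(x[:l], y[:l], p, r)
    tail = lp_power_distortion(x[l:], y[l:], p, r)
    return whole - head - tail

x, y = (0.0, 0.0, 0.0), (3.0, 4.0, 12.0)
print(split_gap(x, y, 1, p=2, r=2))  # essentially zero: squared error is additive
print(split_gap(x, y, 1, p=2, r=1))  # negative: the l_2 norm itself is only subadditive
```

With p = 2 and r = 1 the gap is negative, i.e. the plain Euclidean
norm satisfies the subadditive inequality (43) but not the
additive identity (41).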

A variation on the $l_p$ norm is the $l_\infty$ norm defined by
$\|x-y\|_\infty = \max_i |x_i - y_i|$, which has been proposed as
a candidate for a perceptually meaningful norm. Quantizer design
algorithms exist for this case, but to date no high resolution
quantization theory or rate distortion theory has been developed
for this distortion measure (cf. [347], [231], [348]).

High resolution theory usually considers a fixed dimension k, so
neither additivity nor a family of distortion measures is
required. However, high resolution theory has tended to
concentrate on difference distortion measures, i.e., distortion
measures that have the form $d(x,y) = L(x-y)$, where $x-y$ is
the usual Euclidean difference and L is usually assumed to have
nice properties, such as being monotonic in some norm of its
argument. The rth power distortion measures (of all types) fall
into this category.

Recently the basic results of high resolution theory have been
extended to a family of nondifference distortion measures that
are locally quadratic in the sense that provided $x \approx y$,
the distortion measure is given approximately by a Taylor series
expansion as $(x-y)^t B(y)(x-y)$, where $B(y)$ is a positive
definite weighting matrix that depends on the output. This form
is ensured by assuming that the distortion measure $d(x,y)$ has
continuous partial derivatives of third order almost everywhere
and that the matrix $B(y)$, defined as a k by k dimensional
matrix with the (j, n)th element

$$B_{j,n}(y) = \frac{1}{2}\,\frac{\partial^2 d(x,y)}{\partial x_j\,\partial x_n}\bigg|_{x=y}, \tag{44}$$

is positive definite almost everywhere. The basic idea for this
distortion measure was introduced by Gardner and Rao [186] to
model a perceptual distortion measure for speech, where the
matrix $B(y)$ is referred to as the "sensitivity matrix." The
requirement for the existence of the derivatives of third order
and for the $B(y)$ to be positive definite were added in [316] as
necessary for the analysis. Examples of distortion measures
meeting these conditions include the time-domain form of the
Itakura-Saito distortion [258], [259], [257], [224], which has
the form of an input-weighted quadratic distortion measure of the
form of (21). For this case the input weighting matrix $W_x$ is
related to the partial derivative matrix by
$B(x) = \frac{1}{2}(W_x + W_x^t)$, so that positive definiteness
of $W_x$ assures that of $B(x)$ and the derivative conditions are
transferred to $W_x$. Other distortion measures satisfying the
assumptions are the image distortion measures of Eskicioglu and
Fisher [150] and Nill [386], [387]. The Bennett integral has been
extended to this type of distortion, and approximations for both
fixed-rate and variable-rate operational distortion-rate
functions have been developed [186], [316]. For the fixed rate
case, the result is that

$$D(q) \cong \frac{1}{N^{2/k}} \int f(x)\,(\det(B(x)))^{1/k}\,\frac{m(x)}{\lambda(x)^{2/k}}\,dx, \tag{45}$$

where the modified inertial profile $m(x)$ is assumed to be the
limit of

$$M(S_i,y_i) = (\det(B(y_i)))^{-1/k}\,\frac{\int_{S_i}(x-y_i)^t B(y_i)(x-y_i)\,dx}{V(S_i)^{1+2/k}}.$$

A natural extension of Gersho's conjecture to the nondifference
distortion measures under consideration implies that, as in the
squared-error case, the optimal inertial profile is assumed to be
constant (which in any case will yield a bound) and minimizing
the above (for example using Holder's inequality) yields the
optimal point density

$$\lambda(x) = \frac{\left(f(x)(\det(B(x)))^{1/k}\right)^{k/(k+2)}}{\int \left(f(x')(\det(B(x')))^{1/k}\right)^{k/(k+2)} dx'} \tag{46}$$

and the operational distortion-rate function (analogous to (30))

$$\delta(R) \cong M_k\beta_k\sigma^2 2^{-2R}, \tag{47}$$

where now

$$\beta_k = \frac{1}{\sigma^2}\left\{\int \left(f(x)(\det(B(x)))^{1/k}\right)^{k/(k+2)} dx\right\}^{(k+2)/k} \tag{48}$$

generalizes Zador's factor to the given distortion measure. As
shown later in (58), $M_k$ can be bounded below by the moment of
inertia of a sphere. Similarly, in the variable-rate case

$$\delta(R) \cong M_k\,2^{\frac{2}{k}\left(h(X) + \frac{1}{2}\int \log(\det(B(x)))\,f(x)\,dx\right)}\,2^{-2R}, \tag{49}$$

with optimal inertial profile $m(x) = M_k$ and optimal point
density

$$\lambda(x) = \frac{(\det(B(x)))^{1/2}}{\int (\det(B(x')))^{1/2}\,dx'}. \tag{50}$$

Both results reduce to the previous results for the special case
of a squared-error distortion measure since then
$\det(B(x)) = 1$. Note in particular that the optimal point
density for the entropy-constrained case is not in general a
uniform density.

Parallel results for Shannon lower bounds to the rate-distortion
function have been developed for this family of distortion
measures by Linder and Zamir [323] and results for
multidimensional companding with lattice codes for similar
distortion measures have been developed by Linder, Zamir, and
Zeger [325].

H. Rigorous Approaches to High Resolution Theory

Over the years, high resolution analyses have been presented in
several styles. Informal analyses of distortion, such as those
used in this paper to obtain $\Delta^2/12$ and Bennett's integral
(26), generally ignore overload distortion and estimate granular
distortion by approximating the density as being constant within
each quantization cell. In contrast, rigorous analyses generally
focus on sequences of ever finer quantizers, for which they
demonstrate that, in the limit, overload distortion becomes
negligible in comparison to granular distortion and the ratio of
granular distortion to some function of the fineness parameter
tends to a constant. Though informal analyses generally lead to
the same basic results as rigorous ones, the latter make it clear
that the approximations are good enough that their percentage
errors decrease to zero as the quantizers become finer, whereas
the former do not. Moreover, the rigorous derivations provide
explicit conditions under which the assumption of negligible
overload distortion is valid. Some analyses (informal and
rigorous) provide corrections for overload distortion, and some
even give examples where the overload distortion cannot be
asymptotically ignored but can be estimated nevertheless. Similar
comments apply to informal vs. rigorous analyses of asymptotic
entropy. In the following we review the development of rigorous
theory.



Many analyses, informal and rigorous, explicitly assume the
source has finite range (i.e. a probability distribution with
bounded support); so there is no overload distortion to be
ignored [43], [405], [474]. In some cases the source really does
have finite range. In others, for example speech and images, the
source samples have infinite range, but the measurement device
has finite range. In such cases, the truncation by the
measurement device creates an implicit overload distortion that
is not affected by the design of the quantizer. It makes little
sense, then, to choose a quantizer so fine that its (granular)
distortion is significantly less than this implicit overload
distortion. This means there is an upper limit to the fineness of
quantizers that need be considered, and consequently, one must
question whether such fineness is small enough that the source
density can be approximated as constant within cells. Some
analyses do not explicitly assume the source density has finite
support, but merely assert that overload distortion can be
ignored. In our view this differs only stylistically from an
explicit assumption of finite support, for both approaches ignore
overload distortion. However, assuming finite support is,
arguably, humbler and mathematically more honest.

The earliest quantizer distortion analyses to appear in the open
literature [43], [405], [474] assumed finite range and used the
density-approximately-constant-in-cells assumption. Several
papers avoided the latter by using a Taylor series expansion of
the source density. For example, Lloyd [330] used this approach
to show that, ignoring overload distortion, the approximation
error in the Panter-Dite formula is $o(1/N^2)$, which means that
it tends to zero, even when multiplied by $N^2$. Roe [443],
Algazi [8] and Wood [539] also used Taylor series.

Overload distortion was first explicitly considered in the work
of Shtein (1959) [471], who optimized the cell size of uniform
scalar quantization using an explicit formula for the overload
distortion (as well as $\Delta^2/12$ for the granular distortion)
and, while rederiving the Panter-Dite formula, added an overload
distortion term.

The earliest rigorous analysis$^{11}$ is contained in
Schutzenberger's 1958 paper [462], which showed that for
k-dimensional variable-rate quantization (L = 1), rth power
distortion ($\|x-y\|^r$), and a source with finite differential
entropy and $E[\|X\|^{r'}] < \infty$ for some r' > r, there is a
$K_{k,r} > 0$, depending on the source and the dimension, such
that any k-dimensional quantizer with finitely or infinitely many
cells, and output entropy H, has distortion at least
$K_{k,r}2^{-(r/k)H}$. Moreover, there exists
$K'_{k,r} > K_{k,r}$ and a sequence of quantizers with increasing
output entropies H and distortion no more than
$K'_{k,r}2^{-(r/k)H}$. In essence, these results show that

$$K_{k,r}\,2^{-(r/k)R} \leq \delta_{k,1}(R) \leq K'_{k,r}\,2^{-(r/k)R}, \quad \text{for all } R.$$

Unfortunately, as Schutzenberger notes, the ratio of $K'_{k,r}$
to $K_{k,r}$ tends to infinity as dimension increases. As he
indicates, the problem is that in demonstrating the upper bound,
he constructs a sequence of quantizers with cubic cells of equal
size and then bounds from above the distortion in each cell by
something proportional to its diameter to the rth power. If
instead one were to bound the distortion by the moment of inertia
of the cell times the maximum value of the density within it,
then $K'_{k,r}/K_{k,r}$ would not tend to infinity.

11 Though Lloyd [330] gave a fairly rigorous analysis of
distortion, we do not include his paper in this category because
it ignored overload distortion.

Next, two papers appeared in the same issue of Acta Math. Acad.
Sci. Hungar. in 1959. The paper by Renyi [433] gave, in effect, a
rigorous derivation of (11) for a uniform quantizer with
infinitely many levels. Specifically, it showed that
$H(q_n(X)) = h(X) + \log n + o(1)$, provided that the source
distribution is absolutely continuous and that $H(q_n(X))$ and
$h(X)$ are finite, where $q_n$ denotes a uniform quantizer with
step size $1/n$ and $o(1)$ denotes a quantity that approaches
zero as n goes to $\infty$. The paper also explores what happens
when the distribution is not absolutely continuous.
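The relation $H(q_n(X)) = h(X) + \log n + o(1)$ can be observed
numerically. The sketch below is an illustration under our own
assumptions: a unit-variance Gaussian source, cells
$[i/n, (i+1)/n)$, entropies in bits, and cells beyond ±`span`
ignored because their probability is negligible at double
precision. The function names are hypothetical.

```python
import math

def gaussian_cdf(x):
    """Cumulative distribution function of N(0,1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def uniform_quantizer_entropy(n, span=10.0):
    """Entropy in bits of an N(0,1) variable quantized with step 1/n."""
    entropy = 0.0
    cells = int(span * n)
    for i in range(-cells, cells):
        p = gaussian_cdf((i + 1) / n) - gaussian_cdf(i / n)
        if p > 0.0:
            entropy -= p * math.log2(p)
    return entropy

def entropy_gap(n):
    """H(q_n(X)) - (h(X) + log2 n), the o(1) term for the N(0,1) source."""
    h = 0.5 * math.log2(2.0 * math.pi * math.e)
    return uniform_quantizer_entropy(n) - (h + math.log2(n))

for n in (1, 4, 16, 64):
    print(n, entropy_gap(n))
```

The printed gap shrinks toward zero as the step size 1/n
decreases, in line with the limit above.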

In the second paper, Fejes Toth [159] showed that for a
two-dimensional random vector that is uniformly distributed on
the unit square, the mean squared error of any N point quantizer
is bounded from below by $M(\text{hexagon})/N$. This result was
independently rederived in a simpler fashion by Newman (1964)
[385]. Clearly, the lower bound is asymptotically achievable by a
lattice with hexagonal cells. It follows then that the ratio of
$\delta_2(R)$ to $M(\text{hexagon})\sigma^2 2^{-2R}$ tends to
one, and also, that Gersho's conjecture holds for dimension two.
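For reference, the normalized moment of inertia of the hexagon
can be computed exactly from its vertices. The sketch below
assumes the per-dimension normalization
$M(S) = \frac{1}{k}\int_S \|x\|^2 dx / V(S)^{1+2/k}$ with k = 2
and the centroid at the origin, under which a square gives 1/12;
it uses the standard polygon second moment formulas. The function
name is ours.

```python
import math

def normalized_moment_of_inertia(vertices):
    """Normalized moment of inertia (k = 2, per-dimension normalization)
    of a polygon whose centroid is at the origin; vertices are listed
    counterclockwise."""
    twice_area = 0.0
    polar = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        # contribution of this edge to the integral of x^2 + y^2
        polar += cross * (x0 * x0 + x0 * x1 + x1 * x1 +
                          y0 * y0 + y0 * y1 + y1 * y1)
    area = twice_area / 2.0
    polar /= 12.0
    return 0.5 * polar / area ** 2

hexagon = [(math.cos(math.pi * i / 3), math.sin(math.pi * i / 3))
           for i in range(6)]
square = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
m_hex = normalized_moment_of_inertia(hexagon)
m_sq = normalized_moment_of_inertia(square)
print(m_hex, m_sq)
```

The hexagon value, $5/(36\sqrt{3}) \approx 0.0802$, is slightly
below the 1/12 of the square, which is the two-dimensional space
filling gain.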

Zador's thesis (1963) [561] was the next rigorous work. As
mentioned earlier, it contains two principal results. For
fixed-rate quantization, rth power distortion measures of the
form $\|x-y\|^r$ and a source that is uniformly distributed on
the unit cube, it first shows (Lemma 2.3) that the operational
distortion-rate function$^{12}$ $\delta_k(N)$ multiplied by
$N^{r/k}$ approaches a limit $b_{k,r}$ as $N \to \infty$. The
basic idea, which Zador attributes to J. M. Hammersley, is the
following: For any positive integers N and n, divide the unit
cube into $n^k$ subcubes, each with sides of length $1/n$.
Clearly, the best code with $\tilde{N} = n^k N$ codevectors is at
least as good as the code constructed by using the best code with
N points for each subcube. It follows then that
$\delta_k(\tilde{N}) \leq \delta_k(n,N) = \frac{1}{n^r}\delta_k(N)$,
where $\delta_k(n,N)$ is the operational distortion-rate function
of a source that is uniformly distributed on a subcube and where
the second relation follows from the fact that this "sub" source
is just a scaling of the original source. Multiplying both sides
by $\tilde{N}^{r/k}$ yields

$$\tilde{N}^{r/k}\delta_k(\tilde{N}) \leq N^{r/k}\delta_k(N).$$

Thus we see that increasing the number of codevectors from N to
$\tilde{N} = n^k N$ does not increase $N^{r/k}\delta_k(N)$. A
somewhat more elaborate argument shows that this is approximately
true for any sufficiently large $\tilde{N}$ and, as a result,
that

$$\limsup_{N\to\infty} N^{r/k}\delta_k(N) \leq \liminf_{N\to\infty} N^{r/k}\delta_k(N);$$

i.e. $N^{r/k}\delta_k(N)$ has a limit. One can see how the
self-similarity of the uniform density (it is divisible into
similar subdensities) plays a key role in this argument. Notice
also that nowhere do the shapes of the cells or the point density
enter into it.
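The subcube construction can be illustrated in the scalar case
(k = 1), where "subcubes" are subintervals of [0, 1]. The sketch
below is our own illustration: it assumes that equally spaced
midpoints form the best N-point code for the uniform density, and
it evaluates the rth power distortion exactly cell by cell. All
function names are hypothetical.

```python
def distortion(code, r):
    """Exact E|X - q(X)|^r for X uniform on [0, 1] with nearest-neighbor
    encoding onto the codepoints in `code` (all inside [0, 1])."""
    code = sorted(code)
    edges = [0.0] + [(a + b) / 2 for a, b in zip(code, code[1:])] + [1.0]
    return sum(((y - lo) ** (r + 1) + (hi - y) ** (r + 1)) / (r + 1)
               for y, lo, hi in zip(code, edges, edges[1:]))

def best_code(N):
    """Midpoints of N equal cells; assumed optimal for the uniform density."""
    return [(i + 0.5) / N for i in range(N)]

def subcube_code(n, N):
    """Place a scaled copy of the N-point code in each of n subintervals."""
    return [(j + y) / n for j in range(n) for y in best_code(N)]

r, n, N = 3, 4, 5
d_small = distortion(best_code(N), r)
d_big = distortion(subcube_code(n, N), r)
print(d_big, d_small / n ** r)                   # scaling by 1 / n^r
print((n * N) ** r * d_big, N ** r * d_small)    # normalized values agree
```

In this scalar uniform case the composite code is itself the
midpoint code with nN points, so the inequality in the argument
above holds with equality.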

Zador next addresses nonuniform densities. With $\|f\|_s$
denoting $\left(\int f^s(x)\,dx\right)^{1/s}$, his Theorem 2.2
shows that if the k-dimensional source density satisfies
$\|f\|_{k/(k+r)} < \infty$ and
E [||X||*- 1+r " H ] < 00 for some c > 0, then 

$$N^{r/k}\delta_k(N) \to b_{k,r}\,\|f\|_{k/(k+r)}$$

as $N \to \infty$. The positive part, namely that

$$\limsup_{N\to\infty} N^{r/k}\delta_k(N) \leq b_{k,r}\,\|f\|_{k/(k+r)},$$

is established by constructing codes in, approximately, the
following manner: Given N, one chooses a sufficiently large
support cube (large enough that overload distortion contributes
little), subdivides the cube into $n^k$ equally sized subcubes,
and places within each subcube a set of codevectors that are
optimal for the uniform distribution on that subcube, where the
number of codevectors in a subcube is carefully chosen so that
the point density in that subcube approximates the optimal point
density for the original source distribution. One then shows that
the distortion of this code, multiplied by $N^{r/k}$, is
approximately $b_{k,r}\|f\|_{k/(k+r)}$. The best codes are at
least this good and it follows that
$\limsup_{N\to\infty} N^{r/k}\delta_k(N) \leq b_{k,r}\|f\|_{k/(k+r)}$.
One can easily see how this construction creates codes with
essentially optimal point density and cell shape. We will not
describe the converse.

12 We abuse notation slightly and let $\delta_k(N)$ denote the
least distortion of k-dimensional quantizers with N codevectors.

Zador's 1966 Bell Labs Memorandum [562] reproves these two main
results under weaker conditions. The distortion measure is rth
power in the general sense, which includes as a special case the
narrow sense of the rth power of the Euclidean norm considered by
Schutzenberger [462].
The requirement on the source density is only that each 
of its marginals has the property that it is bounded from 
above by |z| r+c , for some c > 0 and all x of sufficiently 
large magnitude. This is a pure tail condition, as opposed 
to the finite moment condition of the thesis, which constrains
both the tail and the peak of the density. Note also that it no
longer requires that $\|f\|_{k/(k+r)}$ be finite.

As indicated earlier, Zador's memorandum also derives the
asymptotic form of the operational distortion-rate function of
variable-rate quantization. In other words, it finishes what his
thesis and Schutzenberger [462] started, though he was apparently
unaware of the latter. Specifically, it shows that



oo, 



where $c_{k,r}$ is some constant no larger than $b_{k,r}$,
assuming the same conditions as the fixed-rate result, plus the
additional requirement that for any $\epsilon > 0$ there is a
bounded set containing all points x such that
$f(x) > \epsilon$.

Gish and Pierce (1968) [204], who discovered that uni- 
form is the asymptotically best type of scalar quantizer for 
variable-rate coding, presented both informal and rigorous 
derivations — the latter being the first to appear in these 
Transactions. Specifically, they showed rigorously that for 
uniform scalar quantization with infinitely many cells of 
width A, the distortion Da and the output entropy H& 
behave as follows: 



I^oA7l2 



= 1 



which makes rigorous the A 2 /12 formula and (11), respec- 
tively. For this result, they required the density to be con- 
tinuous except at finitely many points, and to satisfy a tail 
condition similar to Zador's and another condition about 
the behavior at points of discontinuity. The paper also out- 
lined a rigorous proof of (32) in the scalar case, i.e. that 
6\,i(R)/Z\ t i(R) — ► 1 as R — ► oo. But as to the details it 
ofFcrcd only that: "The complete proof is surprisingly long 
and will not be given here." Though Gish and Pierce were 
the first to informally derive (13), neither this paper nor 
any paper to date has provided a rigorous derivation. 

Eli as (1970) [143] also made a rigorous analysis of scalar 
quantization, giving asymptotic bounds to the distortion of 
scalar quantizers with a rather singularly defined measure 
of distortion, namely, the rth root of the average of the rth 
power of the cell widths. A companion paper [144] consid- 
ers similar bounds to the performance of vector quantizers 
with an analogous avcragc-ccll-sizc distortion measure. 

In 1973 Csiszar [114] presented a rigorous generalization 
of (52) to higher dimensional quantizers. Of most inter- 
est here is the following special case of his principal result 
(Theorem 1): Consider a k- dimensional source and a se- 
quence of Ar-dimcnsional quantizers <fi, <j2> • • • , where q n has 
a cquntably infinite number of cells, each with volume v n> 
where the t/ n 's and also the maximum of the cell diameters 
tends to zero. Then under certain conditions, including the 
condition that there be at least some quantizer with finite 
output entropy, the output entropy H n satisfies 



lim (H n + \ogv n ) = h(X). 

n— »oo ' 



(53) 



Clearly, this result applies to quantizers generated by lat- 
tices and, more generally, tessellations. It also applies to 
quantizers with finitely many cells for sources with compact 
support. But it docs not apply to quantizers with finitely 
many cells and sources with infinite support, because it 
docs not deal with the overload region of such quantizers. 

In 1977 Babkin et al. [580] obtained results indicating
how rapidly the distortion of fixed-rate lattice quantizers
approaches $\delta(R)$ as rate $R$ and dimension $k$ increase,
for difference distortion measures. In 1978 these
same authors [581] studied uniform scalar quantization
with variable-rate coding, and extended Koshelev's result
to $r$th power distortion measures.

The next contribution is that of Bucklew and Gallagher
(1980) [63], who studied asymptotic properties of fixed-rate
uniform scalar quantization. With $\Delta_N$ denoting the
cell width that minimizes distortion among $N$-cell uniform
scalar quantizers and $D_N$ denoting the resulting minimum
mean squared error, they showed that for a source with a
Riemann integrable density $f(x)$

$$\lim_{N \to \infty} N \Delta_N = \mathrm{supp}(f) \quad \text{and} \quad \lim_{N \to \infty} N^2 D_N = \frac{\mathrm{supp}(f)^2}{12},$$

where $\mathrm{supp}(f)$ is the length of the shortest interval $(a, b)$
with probability one. When the support is finite, i.e. $a$
and $b$ are finite, the above implies $D_N/(\Delta_N^2/12) \to 1$ as
$N \to \infty$, and so $D_N$ decreases as $1/N^2$. This makes the
$\Delta^2/12$ formula rigorous in the finite $N$ case, at least when
$\Delta$ is chosen optimally. However, when the support is infinite,
e.g. a Gaussian density, $D_N$ decreases at a rate
slower than $1/N^2$, and the resulting signal to noise ratio
vs. rate curve separates from any line of slope 6 dB/bit.
Consequently, the ratio of the operational distortion-rate
functions of uniform and nonuniform scalar quantizers
increases without bound as the rate increases; i.e. uniform
quantization is asymptotically bad. Moreover, they showed
that $D_N/(\Delta_N^2/12)$ does not always converge to 1. Instead
$\liminf_{N \to \infty} D_N/(\Delta_N^2/12) \ge 1$, and they exhibited
densities where the inequality is strict. In such cases the $\Delta^2/12$
formula is invalidated by the heavy tails of the density. It
was not until much later that the asymptotic forms of $\Delta_N$
and $D_N$ were found, as will be described later.
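
The infinite-support behavior can be seen numerically. The sketch below, which is an illustration and not the method of [63], finds the MSE-optimal step size of an $N$-level uniform quantizer for a unit-variance Gaussian source by a plain grid search, using a closed-form expression for each cell's contribution. The symmetric midpoint quantizer, the grid resolution, and all function names are assumptions of this example.

```python
import math

def phi(x):
    """Standard Gaussian density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard Gaussian CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cell_mse(a, b, y):
    """Integral of (x - y)^2 * phi(x) over [a, b]; a and b may be infinite."""
    def antiderivative(x):
        if math.isinf(x):
            return (1.0 + y * y) if x > 0 else 0.0
        return (1.0 + y * y) * Phi(x) - (x - 2.0 * y) * phi(x)
    return antiderivative(b) - antiderivative(a)

def uniform_mse(n, step):
    """MSE of a symmetric n-level uniform quantizer; outer cells are unbounded."""
    total = 0.0
    for i in range(n):
        y = (i - (n - 1) / 2.0) * step
        a = -math.inf if i == 0 else y - step / 2.0
        b = math.inf if i == n - 1 else y + step / 2.0
        total += cell_mse(a, b, y)
    return total

def best_step(n, grid=0.001, max_step=2.0):
    """Grid search for the step size minimizing MSE (resolution `grid`)."""
    candidates = [grid * j for j in range(1, int(max_step / grid) + 1)]
    step = min(candidates, key=lambda s: uniform_mse(n, s))
    return step, uniform_mse(n, step)

if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        step, mse = best_step(n)
        print(f"N={n:3d}  step={step:.3f}  N^2*D={n * n * mse:.3f}  "
              f"D/(step^2/12)={mse / (step * step / 12.0):.3f}")
```

For these values of $N$ the product $N^2 D_N$ keeps growing rather than settling to a constant, and $D_N/(\Delta_N^2/12)$ stays above one, in line with the discussion above.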

Formal theory advanced further in papers by Bucklew
and Wise, Cambanis and Gerr, and Bucklew. The first
of these (1982) [64] demonstrated Zador's fixed-rate result
for $r$th power distortion $\|x - y\|^r$, assuming only that
$E[\|X\|^{r+\delta}] < \infty$ for some $\delta > 0$. It also contained a
generalization to random vectors without probability densities,
i.e. with distributions that are not absolutely continuous
or even continuous. The paper also gave the
first rigorous approach to the derivation of Bennett's integral
for scalar quantization via companding. However,
as pointed out by Linder (1991) [320], there was "a gap
in the proof concerning the convergence of Riemann sums
with increasing support to a Riemann integral." Linder
fixed this and presented a correct derivation with weaker
assumptions. Cambanis and Gerr (1983) [70] claimed a
similar result, but it had more restrictive conditions and
suffered from the same sort of problems as [64]. A subsequent
paper by Bucklew (1984) [58] derived a result
for vector quantizers that lies between Bennett's integral
and Zador's formula. Specifically, it showed that when a
sequence of quantizers is asymptotically optimal for one
probability density $f^{(1)}(x)$, then its $r$th power distortion
on a source with density $f^{(2)}(x)$ is asymptotically given
by $N^{-r/k} b_{r,k} \int \lambda^{-r/k}(x) f^{(2)}(x)\,dx$, where $\lambda(x)$ is the optimal
point density for $f^{(1)}(x)$. On the one hand, this is like Bennett's
integral in that $f^{(1)}(x)$, and consequently $\lambda(x)$, can
be arbitrary. On the other hand, it is like Zador's result (or
Gersho's generalization of Bennett's integral [193]) in that,
in essence, it is assumed that the quantizers have optimal
cell shapes.
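
A scalar, squared-error instance of this mismatch formula is easy to evaluate numerically. In one dimension the source-independent constant is $1/12$ and the optimal point density for a density $f^{(1)}$ is proportional to $(f^{(1)})^{1/3}$, so for a Gaussian design density with standard deviation $\sigma_1$ the point density is Gaussian with standard deviation $\sqrt{3}\,\sigma_1$. The sketch below integrates $\frac{1}{12}\int \lambda^{-2}(x) f^{(2)}(x)\,dx$ for a Gaussian source with standard deviation $\sigma_2$; the Gaussian choices, the midpoint-rule integration, and the function names are assumptions of this example.

```python
import math

def log_normal_pdf(x, sigma):
    """Natural log of a zero-mean Gaussian density with standard deviation sigma."""
    return -0.5 * (x / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def mismatch_constant(sigma_design, sigma_source, n_steps=20000):
    """Approximate limit of N^2 * D for a quantizer designed for sigma_design
    but applied to a Gaussian source with standard deviation sigma_source."""
    lam_sigma = math.sqrt(3.0) * sigma_design  # point density ~ f^(1/3)
    # The integrand behaves like exp(-decay * x^2); it must decay to be integrable.
    decay = 0.5 / sigma_source ** 2 - 1.0 / (3.0 * sigma_design ** 2)
    if decay <= 0.0:
        raise ValueError("integral diverges: source tails too heavy for this point density")
    half_width = math.sqrt(50.0 / decay)
    dx = 2.0 * half_width / n_steps
    total = 0.0
    for i in range(n_steps):
        x = -half_width + (i + 0.5) * dx
        # Work in the log domain to avoid overflow in lambda^(-2).
        total += math.exp(log_normal_pdf(x, sigma_source)
                          - 2.0 * log_normal_pdf(x, lam_sigma))
    return total * dx / 12.0

if __name__ == "__main__":
    print("matched      :", mismatch_constant(1.0, 1.0))
    print("source 0.8   :", mismatch_constant(1.0, 0.8))
    print("source 1.1   :", mismatch_constant(1.0, 1.1))
```

In the matched case the value agrees with the familiar $\pi\sqrt{3}/2$ for the Gaussian density. When the source variance exceeds 1.5 times the design variance the integral diverges, a simple reminder of why tail conditions appear in the rigorous results.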

In 1994 Linder and Zeger [326] rigorously derived the
asymptotic distortion of quantizers generated by tessellations
by showing that the quantizer $q_\alpha$ formed by tessellating
with some basic cell shape $S$ scaled by a positive
number $\alpha$ has average (narrow sense) $r$th-power distortion
$D_\alpha$ satisfying

$$\lim_{\alpha \to 0} \frac{D_\alpha}{\alpha^r \, \mathrm{vol}(S)^{r/k} M(S)} = 1.$$

They then combined the above with Csiszár's result (53) to
show that under fairly weak conditions (finite differential
entropy and finite output entropy for some $\alpha > 0$) the output
entropy $H_\alpha$ and the distortion $D_\alpha$ are asymptotically
related via

$$\lim_{\alpha \to 0} \frac{D_\alpha}{M(S) \, 2^{(r/k)(h(X) - H_\alpha)}} = 1,$$

which is what Gersho derived informally [193].

The generalization of Bennett's integral to fixed-rate vector
quantizers with rather arbitrary cell shapes was accomplished
by Na and Neuhoff (1995) [365], who presented
both informal and rigorous derivations. In the rigorous
derivations, it was shown that if a sequence of quantizers
$\{q_N\}$, parameterized by the number of codevectors, has
specific point density and specific inertial profile converging
in probability to a model point density and a model
inertial profile, respectively, then $N^{r/k} D(q_N)$ converges to
Bennett's integral $\int m(x) \lambda^{-r/k}(x) f(x)\,dx$, where distortion
is $r$th power $\|x - y\|^r$. A couple of additional conditions
were also required, including one that is, implicitly, a tail
condition.

Though uniform scalar quantization with finitely many
levels is the oldest and most elementary form of quantization,
the asymptotic form of the optimal step size $\Delta_N$ and
resulting mean squared error $D_N$ has only recently been
found for Gaussian and other densities with infinite support.
Specifically, Hui and Neuhoff [253], [254], [255] have
found that for a Gaussian density with variance $\sigma^2$

$$\lim_{N \to \infty} \frac{N \Delta_N}{4\sigma\sqrt{\ln N}} = 1 \quad \text{and} \quad \lim_{N \to \infty} \frac{N^2 D_N}{\frac{4}{3}\sigma^2 \ln N} = 1.$$

This result was independently found by Eriksson and
Agrell [149]. Moreover, it was shown that overload distortion
is asymptotically negligible and that $D_N/(\Delta_N^2/12) \to 1$,
which is the first time this has been proved for a source
with infinite support. It follows from the above that signal
to noise ratio increases as $6.02R - 10 \log_{10} R$, which shows
concretely how uniform scalar quantization is asymptotically
bad. Hui and Neuhoff also considered non-Gaussian
sources and provided a fairly general characterization of
the asymptotic form of $\Delta_N$ and $D_N$. It turned out that
the overload distortion is asymptotically negligible when
and only when the tail parameter $\tau = \lim_{y \to \infty} y^{-1} E[X \mid X > y]$
equals one, which is the case for all generalized Gaussian
densities. For such cases, more accurate approximations
to $\Delta_N$ and $D_N$ can be given. For densities with $\tau > 1$,
the ratio of overload to granular distortion is $(\tau - 1)/(2 - \tau)$, and
$D_N/(\Delta_N^2/12) \to 1/(2 - \tau)$. There are even densities with tails so
heavy that $\tau = 2$ and the granular distortion becomes negligible
in comparison to the overload distortion. In a related
result, the asymptotic form of the optimal scaling factor for
lattice quantizers has also been found recently for an i.i.d.
Gaussian source [359], [149].

We conclude this subsection by mentioning some gaps in
rigorous high resolution theory. One, of course, is a proof
or counterproof of Gersho's conjecture in dimensions three
and higher. Another is the open question of whether the
best tessellation in three or more dimensions is a lattice.
Both of these are apparently difficult questions. There have
been no rigorous derivations of (11), or its extension to
higher dimensional tessellations, where the quantizers have
finitely many levels, and overload distortion must be dealt
with. Likewise there have been no rigorous derivations of
(13), or its higher dimensional generalization, except in the
case where the point density is constant. Even assuming
Gersho's conjecture is correct, there is no rigorous derivation
of the Zador-Gersho formulas (30) and (32) along the
lines of the informal derivations that start with Bennett's
integral. We also mention that the tail conditions given
in some of the rigorous results (e.g. [58], [365]) are very
difficult to check. Simpler ones are needed. Finally, as
discussed in Section II there are no convincing (let alone
rigorous) asymptotic analyses of the operational distortion-rate
function of DPCM.

I. Comparing High Resolution Theory and Shannon Rate
Distortion Theory 

It is interesting to compare and contrast the two princi- 
pal theories of quantization, and we shall do so in a number 
of different domains. 

Applicability: Sources 

Shannon rate distortion theory applies, fundamentally,
to infinite sequences of random variables, i.e. to sources
modelled as random processes. Its results derive from the
frequencies with which events repeat, as expressed in a law
of large numbers, such as the weak law or an ergodic theorem.
As such, it applies to sources that are stationary
in either the strict sense or some weaker sense, such as
asymptotic mean stationarity (cf. [218], p. 16). Though
originally derived for ergodic sources, it has been extended
to nonergodic sources [221], [469], [126], [138], [479]. In
contrast, high resolution theory applies, fundamentally, to
finite-dimensional random vectors. However, for stationary
(or asymptotically stationary) sources, taking limits yields
results for random processes. For example, the operational
distortion-rate function $\delta(R)$ was found to equal $\bar{Z}(R)$ in
this way; see (33). Rate distortion theory also has one result
relevant to finite-dimensional random vectors, namely,
that the operational distortion-rate functions for fixed- and
variable-rate quantization, $\delta_k(R)$ and $\delta_{k,1}(R)$, are (strictly)
bounded from below by the $k$th-order Shannon distortion-rate
function.

Both theories have been extended to continuous-time
random processes. However, the high resolution results
are somewhat sketchy [43], [330], [204]. Both can be applied
to two or higher dimensional sources such as images
or video. Both have been developed the most for Gaussian
sources in the context of squared-error distortion, which is
not surprising in view of the tractability of squared error
and Gaussianity.

Applicability: Distortion Measures 

Shannon rate distortion theory applies primarily to additive
distortion measures; i.e. distortion measures of the
form

$$d(x, y) = \sum_{i=1}^{k} d_1(x_i, y_i)$$

(or a normalized version), though there are some results for
subadditive distortion measures [340], [218] and some for
distortion measures such as $(x - y)^t B_x (x - y)$ [323]. High
resolution theory has the most results for $r$th power difference
distortion measures, and as mentioned previously,
some of its results have recently been extended to nondifference
distortion measures such as $(x - y)^t B_x (x - y)$ [186],
[316], [325]. In any event both theories are the most fully
developed for the squared-error distortion measure, especially
for Gaussian sources. In addition, both theories require
a finite moment condition, specific to the distortion
measure. For squared-error distortion, it is simply that
the variance of the source be finite. More generally, it
is that $E[d(X, y)] < \infty$ for some $y$. In addition, as discussed
previously, rigorous high resolution theory results
require tail conditions on the source density, for example,
$E[|X|^{2+\delta}] < \infty$ for some $\delta > 0$.

Complementarity 

The two theories are complementary in the sense that
Shannon rate distortion theory prescribes the best possible
performance of quantizers with a given rate and asymptotically
large dimension, while high resolution theory prescribes
the best possible performance of codes with a given
dimension and asymptotically large rate. That is, for fixed-rate
codes

$$\delta_k(R) \cong D(R) \quad \text{for large } k \text{ and any } R \tag{54}$$
$$\delta_k(R) \cong Z_k(R) \quad \text{for large } R \text{ and any } k \tag{55}$$

and similarly for variable-rate codes

$$\delta_{k,L}(R) \cong D(R) \quad \text{for large } k \text{ and any } L, R \tag{56}$$
$$\delta_{k,L}(R) \cong Z_{k,L}(R) \quad \text{for large } R \text{ and any } k, L. \tag{57}$$

When both dimension and rate are large, they all give the
same result, i.e.,

$$\delta_k(R) \cong \delta_{k,L}(R) \cong \delta(R) \cong D(R).$$

Rates of Convergence 

It is useful to know how large R and k must be, respec- 
tively, for high resolution and rate distortion theory for- 
mulas to be accurate. As a rule of thumb, high resolution 
theory is fairly accurate for rates greater than or equal to 
about 3. And it is sufficiently accurate at rates about 2 for 
it to be useful when comparing different sources and codes. 
For example, Figure 7 shows signal-to-noise ratios for fixed-rate
quantizers produced by conventional design algorithms
and predictions thereof based on the Zador-Gersho function
$Z_k(R)$, for two Gaussian sources: i.i.d. and Markov with
correlation coefficient 0.9. It is apparent from data such
as this that the accuracy of the Zador-Gersho function
approximation to $\delta_k(R)$ increases with dimension.

Fig. 7. Signal-to-noise ratios for optimal VQs (dots) and predictions
thereof based on the Zador-Gersho formula (straight lines).

The convergence rate of $\delta_k(R)$ to $\delta(R)$ as $k$ tends to infinity
has also been studied [413], [548], [321], [576]. Roughly
speaking these results show that for memoryless sources
the convergence rate is between $\sqrt{\log k / k}$ and $(\log k)/k$. Unfortunately,
this theory does not enable one to actually predict
how large the dimension must be in order that $\delta_k(R)$ is
within some specified percentage, e.g. 10%, of $\delta(R)$. However,
one may use high resolution theory to do this by
comparing $M_k \beta_k$ (or $M_k \gamma_{k,L}$ in the variable-rate case) to
its limiting value as $k \to \infty$. For example, for the i.i.d. Gaussian
source Figure 5 shows that $\delta_k(R)$ yields distortions within 1 and 0.2 dB of
that predicted by $\delta(R)$ at dimensions 12 and 100, respectively.
For sources with memory, the dimension needs to
be larger, by roughly the effective memory length. One
may conclude that the Shannon distortion-rate function
approximation to $\delta_k(R)$ is applicable for moderate to large
dimensions $k$.
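
The dimension needed to come within a given number of dB of $\delta(R)$ can be estimated with a few lines of code. The sketch below treats the i.i.d. Gaussian source with squared error and makes two assumptions that are not taken from the text: it replaces $M_k$ by the normalized moment of inertia of a $k$-dimensional sphere (a lower bound to $M_k$), and it uses the standard closed form $\beta_k = 2\pi\sigma^2 \left((k+2)/k\right)^{(k+2)/2}$ for the Gaussian Zador factor. Since $D(R) = \sigma^2 2^{-2R}$ for this source, the high resolution loss relative to the Shannon limit is then the product $M_k \beta_k / \sigma^2$.

```python
import math

def sphere_moment(k):
    """Normalized moment of inertia of a k-dimensional ball.

    Used here as a stand-in (lower bound) for M_k; equals 1/12 for k = 1.
    """
    return math.exp((2.0 / k) * math.lgamma(k / 2.0 + 1.0)) / ((k + 2.0) * math.pi)

def gaussian_beta(k):
    """Zador factor of a unit-variance i.i.d. Gaussian source, squared error."""
    return 2.0 * math.pi * ((k + 2.0) / k) ** ((k + 2.0) / 2.0)

def loss_db(k):
    """High resolution loss in dB relative to the Shannon distortion-rate function."""
    return 10.0 * math.log10(sphere_moment(k) * gaussian_beta(k))

if __name__ == "__main__":
    for k in (1, 2, 4, 8, 12, 24, 100):
        print(f"k={k:4d}  loss={loss_db(k):.3f} dB")
```

Under these assumptions the loss is about 4.35 dB for scalar quantization and falls to roughly 1 dB at dimension 12 and 0.2 dB at dimension 100, consistent with the figures quoted above.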

Quantitative Relationships 

For squared-error distortion, the Zador-Gersho function
$Z(R)$ is precisely equal to the well known Shannon lower
bound $D_{\mathrm{slb}}(R)$ to the Shannon distortion-rate function. It
follows that when rate is not large, $Z(R)$ is, at least, a
lower bound to $\delta(R)$. Similarly, the Shannon lower bound
$D_{\mathrm{slb},k}(R)$ to the $k$th-order Shannon distortion-rate function
equals $Z_{k,1}(R)/(2\pi e M_k)$, from which it follows that $D_{\mathrm{slb},k}(R)$
may be thought of as the distortion of a fictional quantizer
having the distortion of an optimal $k$-dimensional variable-rate
quantizer with first-order entropy coding, except that
its cells have the normalized moment of inertia of a high
dimensional sphere instead of $M_k$. It is well known that
$D_{\mathrm{slb}}(R)/D(R)$ approaches one as $R$ increases [111],
[46], [322], which is entirely consistent with the fact that
$Z(R)/\delta(R)$ approaches one as $R$ increases. The relationships
among the various distortion-rate functions are summarized
below. Inequalities marked with a "•" become
tight as dimension $k$ increases, and those marked with a
"+" become tight as $R$ increases.



Applicability: Quantizer Types 

Rate distortion theory finds the performance of the best 
quantizers of any type for stationary sources. It has noth- 
ing to say about suboptimal, structured or dimension- 
constrained quantizers except, as mentioned earlier, that 
quantizers of dimension k have distortion bounded from 
below by the $k$th-order Shannon distortion-rate function.
In contrast, high resolution theory can be used to ana- 
lyze and optimize the performance of a number of families 
of structured quantizers, such as transform, lattice, product,
polar, two-stage and, most directly, dimension-constrained
quantizers. Such analyses are typically based on
Bennett's integral. Indeed, the ability to analyze struc- 
tured or dimension-constrained quantizers is the true forte 
of high resolution theory. 



Performance vs. Complexity 

Assessing performance vs. complexity should be a major 
goal of quantization theory. On the one hand, rate distortion
theory specifies the fundamental limits to performance
without regard to complexity. On the other hand, because 
high resolution theory can analyze the performance of fam- 
ilies of quantizers with complexity reducing structure, one 
can learn much from it about how complexity relates to 
performance. In recent work, Hui and Neuhoff [256] have
combined high resolution theory and Turing complexity
theory to show that asymptotically optimal quantization
can be implemented with complexity increasing at most
polynomially with the rate.
Computability

First-order Shannon distortion-rate functions can be
computed analytically for squared error and magnitude
error and several source densities, such as Gaussian and
Laplacian, and for some discrete sources, cf. |4ftJ, .[4SWJ,
[560], [217]. For other sources it can be computed with
Blahut's algorithm [52]. And in the case of squared error, it
can be computed with simpler algorithms [168], [444]. For
sources with memory, complete analytical formulas for $k$th-order
distortion-rate functions are known only for Gaussian
sources. For other cases, the Blahut algorithm [52] can be
used to compute $D_k(R)$, though its computational complexity
becomes overwhelming unless $k$ is small. Due to the
difficulty of computing it, many (mostly lower) bounds to
the Shannon distortion-rate function have been developed,
which for reasonably general cases yield the distortion-rate
function exactly for a region of small distortion (cf. [465],
[327], [267], [239], [46], [212], [550], [559], [217]). An important
upper bound derives from the fact that with respect
to squared error, the Gaussian source has the largest Shannon
distortion-rate function ($k$th-order or in the limit) of
any source with the same covariance function.
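
For readers who have not seen it, the following Python sketch shows the usual alternating-minimization form of a Blahut-style iteration for a discrete memoryless source. It is a minimal illustration rather than a transcription of [52]: the slope parameter `beta`, the fixed iteration count, and the function names are assumptions of this example. Each value of `beta` yields one (rate, distortion) point on the first-order rate-distortion curve.

```python
import math

def blahut(p, d, beta, iterations=200):
    """Return one (rate in bits, distortion) point for source pmf p and
    distortion matrix d[x][y], using slope parameter beta > 0."""
    n_x, n_y = len(p), len(d[0])
    q = [1.0 / n_y] * n_y  # initial guess for the output distribution
    for _ in range(iterations):
        # Step 1: best test channel for the current output distribution.
        cond = []
        for x in range(n_x):
            weights = [q[y] * math.exp(-beta * d[x][y]) for y in range(n_y)]
            norm = sum(weights)
            cond.append([w / norm for w in weights])
        # Step 2: output distribution induced by that test channel.
        q = [sum(p[x] * cond[x][y] for x in range(n_x)) for y in range(n_y)]
    distortion = sum(p[x] * cond[x][y] * d[x][y]
                     for x in range(n_x) for y in range(n_y))
    rate = sum(p[x] * cond[x][y] * math.log2(cond[x][y] / q[y])
               for x in range(n_x) for y in range(n_y) if cond[x][y] > 0.0)
    return rate, distortion

def binary_entropy(u):
    """Binary entropy function in bits."""
    if u <= 0.0 or u >= 1.0:
        return 0.0
    return -u * math.log2(u) - (1.0 - u) * math.log2(1.0 - u)

if __name__ == "__main__":
    hamming = [[0.0, 1.0], [1.0, 0.0]]
    for beta in (1.0, 2.0, 4.0):
        rate, dist = blahut([0.5, 0.5], hamming, beta)
        print(f"beta={beta}: D={dist:.4f}  R={rate:.4f}  "
              f"1-h(D)={1.0 - binary_entropy(dist):.4f}")
```

For the equiprobable binary source with Hamming distortion the computed points fall on the known curve $R(D) = 1 - h_b(D)$. The cost per iteration grows with the product of the input and output alphabet sizes, which is why the approach becomes overwhelming for $k$th-order functions unless $k$ is small.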

To compute a Zador-Gersho function, one needs to find
$M_k$ and either $\beta_k$ or $\gamma_k$ in the fixed- and variable-rate cases,
respectively. Though $M_k$ is known only for $k \le 2$, there
are bounds for other values of $k$. One lower bound is the
normalized moment of inertia of a sphere of the same dimension.
Another bound is given in [106]. One upper bound was
developed by Zador; others derive from the currently best
known tessellations (cf. [106], [5]). The Zador factors $\beta_k$
and $\gamma_k$ can be computed straightforwardly for $k = 1$ and,
also, for $k \ge 2$ for i.i.d. sources. In some cases, simple
closed form expressions can be found, e.g. for Gaussian,
Laplacian, gamma densities. In other cases numerical integration
can be used. Upper bounds to $\beta_k$ are given in [294].
To the authors' knowledge, for sources with memory, simple
expressions for the Zador factors have been found only
for Gaussian sources; they depend on the covariance matrix.

Underlying Principles 

Rate distortion theory is a deep and elegant theory based 
on the law of large numbers and the key information theo- 
retic property that derives from it, namely, the AEP. High 
resolution theory is a simpler, less elegant theory based 
on geometric characterizations and integral approximations 
over fine partitions. 

Siblings 

Lossless source coding and channel coding are sibling
branches of information theory, also based on the law of
large numbers and the asymptotic equipartition property.
Siblings of high resolution theory include error probability
analyses in digital modulation and channel coding based
on minimum distance and a high signal-to-noise ratio assumption,
and the average power analyses for the additive
Gaussian channel based on the continuous approximation.

Code Design Philosophy 

Neither theory is ordinarily considered to be constructive,
yet each leads to its own design philosophy. Rate distortion
theory shows that, with high probability, a good
high dimensional quantizer can be constructed by randomly
choosing codevectors according to the output distribution
of the test channel that achieves the Shannon
rate-distortion function. As a construction technique, this
leaves much to be desired because the dimension of such
codes is large enough that the codes so constructed are
completely impractical. On the other hand the AEP indicates
that such codevectors will be roughly uniformly
distributed over a "typical" set, and this leads to the design
philosophy that a good code has its codevectors uniformly
distributed throughout this set. In the special case
of squared error distortion and an i.i.d. Gaussian source
with variance $\sigma^2$, the output distribution is i.i.d. Gaussian
with variance $\sigma^2 - D(R)$; the typical set is a thin
shell near the surface of a sphere of radius $\sqrt{k(\sigma^2 - D(R))}$;
and a good code has its codevectors uniformly distributed
on this shell. Since the interior volume of such a (high-dimensional)
sphere is negligible, it is equally valid for
the codevectors to be uniformly distributed throughout the
sphere. For other sources, the codevectors will be uniformly
distributed over some subset of the shell.

High resolution theory indicates that for large rate and
arbitrary dimension $k$, the quantization cells should be as
spherical as possible, preferably shaped like $T_k$, with normalized
moment of inertia $M_k$. Moreover, the codevectors
should be distributed according to the optimal point density
$\lambda_k^*$. Thus, high resolution theory yields a very clear
design philosophy. In the scalar case, one can use this
philosophy directly to construct a good quantizer, by designing
a compander whose nonlinearity $c(x)$ has derivative
$\lambda_1^*(x)$, and extracting the resulting reconstruction levels
and thresholds to obtain an approximately optimal point
quantizer. This was first mentioned in Panter-Dite [405]
and rediscovered several times. Unfortunately, at higher dimensions,
companders cannot implement an optimal point
density without creating large oblongitis [193], [56], [57].
So there is no direct way to construct optimal vector quantizers
with the high resolution philosophy.
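
The scalar construction can be sketched in a few lines. For squared error the optimal point density is proportional to the cube root of the source density, so for a unit-variance Gaussian source it is a Gaussian density with variance 3, and the compressor $c(x)$ is the corresponding distribution function. The example below, whose function names and use of Python's `statistics.NormalDist` are choices made for illustration, places reconstruction levels at $c^{-1}((i + 1/2)/N)$ and thresholds at $c^{-1}(i/N)$, then evaluates the exact mean squared error.

```python
import math
from statistics import NormalDist

SOURCE = NormalDist(0.0, 1.0)  # assumed source: unit-variance Gaussian

def compander_quantizer(n):
    """Levels and thresholds of an n-level compander-based quantizer."""
    point_density = NormalDist(0.0, math.sqrt(3.0))  # proportional to f^(1/3)
    levels = [point_density.inv_cdf((i + 0.5) / n) for i in range(n)]
    thresholds = [point_density.inv_cdf(i / n) for i in range(1, n)]
    return levels, thresholds

def _antiderivative(x, y):
    """Antiderivative of (x - y)^2 times the source density, with limits at infinity."""
    if math.isinf(x):
        return (1.0 + y * y) if x > 0 else 0.0
    return (1.0 + y * y) * SOURCE.cdf(x) - (x - 2.0 * y) * SOURCE.pdf(x)

def mean_squared_error(levels, thresholds):
    """Exact MSE of the quantizer on the Gaussian source."""
    edges = [-math.inf] + thresholds + [math.inf]
    return sum(_antiderivative(edges[i + 1], y) - _antiderivative(edges[i], y)
               for i, y in enumerate(levels))

if __name__ == "__main__":
    target = math.pi * math.sqrt(3.0) / 2.0  # high resolution value of N^2 * D
    for n in (16, 64, 256):
        levels, thresholds = compander_quantizer(n)
        mse = mean_squared_error(levels, thresholds)
        print(f"N={n:4d}  N^2*D={n * n * mse:.4f}  (high resolution: {target:.4f})")
```

The levels are not centroids of their cells, so the result is only approximately optimal, but $N^2 D$ is close to the high resolution prediction at moderate $N$.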

When dimension as well as rate is large, the two philosophies
merge because the output distribution that achieves
the Shannon distortion-rate function converges to the
source density itself, as does the optimal point density.
However, for small to moderate values of $k$, $\lambda_k^*$ specifies a
better distribution of points than the rate-distortion philosophy
of uniformly distributing codevectors over the typical
set. For example, in the i.i.d. Gaussian case it indicates
that the point density should be a Gaussian hill with
somewhat larger variance than that of the source density.
Which design philosophy is more useful? At low rates (say
1 bit per sample or less), one has no choice but to look
to rate distortion theory. But at moderate to high rates,
it appears that the high resolution design philosophy is
the better choice. To see this consider an i.i.d. Gaussian
source, a target rate $R$ and a $k$-dimensional quantizer with
$2^{kR}$ points uniformly distributed throughout a spherical
support region. This is the ideal code suggested by rate
distortion theory. One obtains a lower bound to its distortion
by assuming that source vectors outside the support
region are quantized to the closest point on the surface
of the sphere, and by assuming that the cells within the
support region are $k$-dimensional spheres. In this case,
at moderate to large rates (say rate 10), after choosing
the diameter of the support region to minimize this lower
bound, it has been found that the dimension $k$ must be
larger than 250 in order that the resulting signal to noise
ratio be within 1 dB of that predicted by the Shannon
distortion-rate function [25]. Similar results were reported
by Pepin et al. [409]. On the other hand, as mentioned earlier,
a quantizer with dimension 12 can achieve this same
distortion. It is clear then that the ability to come fairly
close to $\delta(R)$ with moderately large dimension is not due
to the rate distortion theory design philosophy, the AEP,
nor the use of spherical codes. Rather it is due to the fact
that good codes with small to moderate dimension have
appropriately tapered point densities, as suggested by high
resolution theory.

Finally, it is interesting to note that high resolution theory
actually contains some analyses of the Shannon random
coding approach. For example, Zador's thesis [561] gives
an upper bound on the distortion of a randomly generated
vector quantizer.

Nature of the Error Process 

Both theories have something to say about the distribution
of quantization errors. Generally speaking, what
rate distortion theory has to say comes from assuming that
the error distribution caused by a quantizer whose performance
is close to $\delta(R)$ is similar to that caused by a
test channel that comes close to achieving the Shannon
distortion-rate function. This is reasonable because Shannon's
random coding argument shows that using such a
test channel to randomly generate high dimensional codevectors
leads, with very high probability, to a code whose
distortion is close to $\delta(R)$. For example, one may use this
sort of argument to deduce that the quantization error of
a good high dimensional quantizer is approximately white
and Gaussian when the source is memoryless, the distortion
is squared error, and the rate is large, cf. [404], which
shows Gaussian-like histograms for the quantization error
of VQ's with dimensions 8 to 32. As another example, for
a Gaussian source with memory and squared-error distortion,
rate distortion theory shows there is a simple relation
between the spectra of the source and the spectra of the
error produced by an optimal high dimensional quantizer,
cf. [46].

High resolution theory also has a long tradition of analyzing
the error process, beginning with Clavier et al. [99],
[100] and Bennett [43] and focusing on the distribution of
the error, its spectrum and its correlation with the input.
Bennett showed that in the high resolution case, the
quantization error of a uniform quantizer is approximately
white (and uniformly distributed)
provided the assumptions of the high resolution theory
are met and the joint density of sample pairs is smooth.
(See also Section 5.6 of [196].) Bennett also found exact
expressions for the power spectral density of a uniformly
quantized Gaussian process. Sripad and Snyder [477] and
Claasen and Jongepier [97] derived conditions under which
the quantization error is white in terms of the joint characteristic
functions of pairs of samples, two-dimensional
analogs of Widrow's [529] condition. Zador [562] found
high resolution expressions for the characteristic function of
the error produced by randomly chosen vector quantizers.
Lee and Neuhoff [312], [379] found high resolution expressions
for the density of the error produced by fairly general
(deterministic) scalar and vector quantizers in terms
of their point density and their shape profile, which is a
function that conveys more cell shape information than the
inertial profile. As a side benefit, these expressions indicate
that much can be deduced about the point density and cell
shapes of a quantizer from a histogram of the lengths of the
errors. Zamir and Feder [564] showed that the error produced
by an optimal lattice quantizer with infinitely many
small cells is asymptotically white in the sense that its
components are uncorrelated with zero means and identical
variances. Moreover they showed that it becomes Gaussian
as the dimension increases. The basic ideas are that as dimension
increases good lattices have nearly spherical cells
and that a uniform distribution over a high dimensional
sphere is approximately Gaussian, cf. [525]. Since optimal
high dimensional, high rate VQs can also be expected to
have nearly spherical cells and since the AEP implies that
most cells will have the same size, we reach the same conclusion
as from rate distortion theory, namely that good
high-rate, high-dimensional codes cause the quantization
error to be approximately white and Gaussian.
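
The scalar version of these statements is easy to check by simulation. The sketch below quantizes a first-order Gauss-Markov sequence (correlation coefficient 0.9, unit variance) with a fine uniform quantizer and estimates three quantities: the error variance relative to $\Delta^2/12$, the correlation between error and input, and the lag-one correlation of the error. The source model, step size, sample count, seed, and function name are assumptions chosen for this illustration.

```python
import math
import random

def error_statistics(n=50000, step=0.1, rho=0.9, seed=1):
    """Simulate uniform quantization of a unit-variance AR(1) Gaussian source.

    Returns (error variance / (step^2 / 12),
             correlation between error and input,
             lag-one autocorrelation of the error).
    """
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)
    xs, errs = [], []
    for _ in range(n):
        x = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        q = step * (math.floor(x / step) + 0.5)  # midpoint of the cell containing x
        xs.append(x)
        errs.append(q - x)
    mean_x = sum(xs) / n
    mean_e = sum(errs) / n
    var_x = sum((v - mean_x) ** 2 for v in xs) / n
    var_e = sum((e - mean_e) ** 2 for e in errs) / n
    cov_xe = sum((v - mean_x) * (e - mean_e) for v, e in zip(xs, errs)) / n
    lag1 = sum((errs[i] - mean_e) * (errs[i + 1] - mean_e)
               for i in range(n - 1)) / ((n - 1) * var_e)
    return var_e / (step * step / 12.0), cov_xe / math.sqrt(var_x * var_e), lag1

if __name__ == "__main__":
    ratio, corr_input, corr_lag1 = error_statistics()
    print(f"variance / (step^2/12) = {ratio:.3f}")
    print(f"corr(error, input)     = {corr_input:+.4f}")
    print(f"corr(error_t, error_t+1) = {corr_lag1:+.4f}")
```

Even though successive source samples are strongly correlated, the error sequence is nearly uncorrelated with itself and with the input, because the step size is small compared with the spread of the joint density of sample pairs.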



Successive Approximation 

Many vector quantizers operate in a successive approx- 
imation or progressive fashion, whereby a low rate coarse 
quantization is followed by a sequence of finer and finer 
quantizations, which add to the rate. Tree-structured, multistage
and hierarchical quantizers, to be discussed in the
next section, are examples of such. Other methods can be
used to design progressive indexing into given codebooks,
as in Yamada and Tazaki (1991) [553] and Riskin et al.
(1994) [440].

Successive approximation is useful in situations where 
the decoder needs to produce rough approximations of the 
data from the first bits it receives and, subsequently, to 
refine the approximation as more bits are received. Moreover,
successive approximation quantizers are often structured
in a way that makes them simpler than unstructured
ones. Indeed, the three examples just cited are known more
for their good performance with low complexity than for
their progressive nature. An important question is whether
the performance of a successive refinement quantizer will
be better than one that does quantization in one step. On
the one hand, rate distortion theory analysis [228], [291],
[292], [557], [147], [437], [96] has shown that there are situations
where successive approximation can be done without
loss of optimality. On the other hand, high resolution
analyses of TSVQ [383] and two-stage VQ [311] have quan- 
tified the loss of these particular codes, and in the latter 
case shown ways of modifying the quantizer to eliminate 
the loss. Thus, both theories have something to say about 
successive refinement. 

V. Quantization Techniques 
This section presents an overview of quantization tech- 
niques (mainly vector) that have been introduced, begin- 
ning in the 1980's, with the goal of attaining rate/distortion 
performance better than that attainable by scalar based 
techniques such as direct scalar quantization, DPCM, and 
transform coding, but without the inordinately large com- 
plexity of brute force vector quantization methods. Recall 
that if the dimension of the source vector is fixed, say at 
$k$, then the goal is to attain performance close to the optimal
performance as expressed by $\delta_k(R)$ in the fixed rate
case, or $\delta_{k,L}(R)$ (usually $\delta_{k,1}(R)$) in the general case where
variable-rate codes are permitted. However if, as in the
case of a stationary source, the dimension $k$ can be chosen
arbitrarily, then in both the fixed and variable-rate cases,
the goal is to attain performance close to $\delta(R)$. In this case
all quantizers with $R > 0$ are suboptimal, and quantizers
with various dimensions and even memory (which blurs the
notion of dimension) can be considered.

We would have liked to make a carefully categorized, 
ordered and ranked presentation of the various methods. 
However, the literature and variety of such techniques is
quite large; there are a number of competing ways in which
to categorize the techniques; complexity is itself a difficult
thing to quantify; there are several special cases (e.g. fixed
or variable rate, and fixed or choosable dimension); and
there has not been much theoretical or even quantitative
comparison among them. Consequently, much work is still
needed in sorting the wheat from the chaff, i.e. determining
which methods give the best performance vs. complexity
tradeoff in which situations, and in gaining an understanding
of why certain complexity reducing approaches are better
than others. Nevertheless we have attempted to choose
a reasonable set of techniques and an ordering of them for 
discussion. Where possible we will make comments about 
the efficacies of the techniques. In all cases, we include 
references. 

We begin with a brief discussion of complexity. Roughly
speaking it has two aspects: arithmetic (or computational)
complexity, which is the number of arithmetic operations
per sample that must be performed when encoding or
decoding, and storage (or memory or space) complexity,
which is the amount of auxiliary storage (for example
of codebooks) that is required for encoding or decoding.
Rather than trying to combine them, it makes sense to
keep separate track, because their associated costs vary
with implementation venue, e.g. a PC, UNIX platform,
generic DSP chip, specially designed VLSI chip, etc. In
some venues, storage is of such low cost that one is tempted
to ignore it. However, there are techniques that benefit
sufficiently from increased memory that even though the per
unit cost is trivial, to obtain the best performance-complexity
tradeoff, memory usage should be increased until
the marginal gain-to-cost ratio of further increases is small,
at which point the total cost of memory may be significant.
As a result one might think of a quantizer as being
characterized by a four-tuple $(R, D, A, M)$; i.e. arithmetic
complexity A and storage complexity M have been added to
the usual rate R and distortion D.

As a reminder, given a k-dimensional fixed-rate VQ with
codebook C containing $2^{kR}$ codevectors, brute force full
search encoding finds the closest codevector in C by
computing the distortion between x and each codevector. In
other words, it uses the optimal lossy encoder for the given
codebook, creating the Voronoi partition. In the case
of squared error, this requires computing approximately
$A = 3 \times 2^{kR}$ operations per sample and storing
approximately $M = k \times 2^{kR}$ vector components. For
example, a codebook with rate 0.25 bits per pixel (bpp) and
vector dimension 8 x 8 = 64 has $2^{kR} = 2^{16}$ codevectors,
an impractical number for, say, real time video coding. This
exponential explosion of complexity and memory can cause
serious problems even for modest dimension and rate, but
it can in general make codes completely impractical in
either the high resolution or high dimension extremes. A
brute force variable-rate scheme of the same rate will be
even more complex, typically involving a much greater
number of codevectors, a Lagrangian distortion computation,
and an entropy coding scheme as well. It is the high
complexity of such brute force techniques that motivates
the reduced complexity techniques to be discussed later in
this section.
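
The full-search rule and the complexity figures just quoted
can be illustrated with a minimal sketch. The function names
are hypothetical, and the count of three operations per
component (presumably one subtraction, one multiplication
and one addition) is our reading of the approximate formula
for A rather than something stated explicitly above.

```python
def full_search_encode(x, codebook):
    """Return (index, distortion) of the codevector closest to x in squared error."""
    best_index, best_dist = -1, float("inf")
    for i, y in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(x, y))
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist


def full_search_complexity(k, rate):
    """Approximate per-sample arithmetic A and storage M of full search."""
    n_codevectors = round(2 ** (k * rate))   # 2^{kR} codevectors
    arithmetic = 3 * n_codevectors           # A = 3 * 2^{kR} operations per sample
    storage = k * n_codevectors              # M = k * 2^{kR} vector components
    return n_codevectors, arithmetic, storage


# Toy two-dimensional codebook (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
index, dist = full_search_encode((0.9, 0.2), codebook)

# The example from the text: rate 0.25 bpp, dimension 8 x 8 = 64.
n, a, m = full_search_complexity(k=64, rate=0.25)
```

For the 0.25 bpp example, the sketch reproduces the
$2^{16}$ codevectors mentioned above, along with roughly
$2 \times 10^5$ operations per sample and about
$4 \times 10^6$ stored components.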

Simple measures such as arithmetic complexity and storage
need a number of qualifications. One must decide
whether encoding and decoding complexities need to be
counted separately or summed, or indeed whether only one
of them is important. For example, in record-once-play-many
situations, it is the decoder that must have low
complexity. Having no particular application in mind, we will
focus on the sum of encoder and decoder complexities. For
some techniques (perhaps most) it is possible to trade
computations for storage by the use of precomputed tables. In
such cases a quantizer is characterized, not by a single A
and M but by a curve of such. In some cases, a given set
of precomputed tables is the heart of the method. Another
issue is the cost of memory accesses. Such operations are
usually significantly less expensive than arithmetic
operations. However, some methods do such a good job of
reducing arithmetic operations that the cost of memory accesses
becomes significant. Techniques that attain smaller values
of distortion need higher precision in their arithmetic and
storage, which though not usually accounted for in
assessments of complexity may sometimes be of significance. For
example, a recent study of VQ codebook storage has shown
that in routine cases one needs to store codevector
components with only about R + 4 bits per component, where R is
the rate of the quantizer [252]. Though this study did not
assess the required arithmetic precision, one would guess
that it need not be more than a little larger than that of
the storage; e.g. R plus 5 or 6 bit arithmetic should
suffice. Finally, variable-rate coding raises additional issues
such as the costs associated with buffering, with storing
and accessing variable-length codewords, and with the
decoder having to parse binary sequences into variable-length
codewords.

When assessing complexity of a quantization technique,
it is interesting to compare the complexity invested in
the lossy encoder/decoder vs. that in the lossless
encoder/decoder. (Recall that good performance can
theoretically be attained with either a simple lossy encoder,
such as a uniform scalar quantizer, and a sophisticated
lossless encoder or vice versa, as in high dimensional fixed-rate
VQ.) A quantizer is considered to have low complexity only
when both encoders have low complexity. In the discussion
that follows we focus mainly on quantization techniques
where the lossless encoder is conceptually if not
quantitatively simple. We wish, however, to mention the indexing
problem, which may be considered to lie between the
lossless and the lossy encoder. There are certain fixed-rate
techniques such as lattice quantization, pyramid VQ, and
scalar-vector quantization, where it is fairly easy to find
the cell in which the source vector lies, but the cells are
associated with some set of N indices that are not simply
the integers from 1 to N, where N is the number of cells,
and converting the identity of the cell into a sequence of
log N bits is nontrivial. This is referred to as an indexing
problem.

Finally, we mention two additional issues. The first is
that there are some VQ techniques whose implementation
complexities are not prohibitive, but which have sufficiently
many codevectors that designing them is inordinately
complex or requires an inordinate amount of training data. A
second issue is that in some applications it is desirable that
the output of the encoder be progressively decodable in the
sense that a rough reproduction can be made from the first
bits that it receives, and improved reproductions are made
as more bits are received. Such quantizers are said to be
progressive or embedded. Now it is true that a progressive
decoder can be designed for any encoder (for example, it
can compute the expected value of the source vector given
whatever bits it has received so far). However, a "good"
progressive code is one for which the intermediate
distortions achieved at the intermediate rates are relatively good
(though not usually as good as those of quantizers designed
for one specific rate) and that rather than restarting from
scratch every time the decoder receives a new bit (or group
of bits), it uses some simple method to update the current
reproduction. It is also desirable in some applications for
the encoding to be progressive, as well. Though not
designed with them in mind, it turns out that a number of
the reduced complexity VQ approaches also address these
last two issues. That is, they are easier to design, as well
as progressive.

A. Fast Searches of Unstructured Codebooks 

Many techniques have been developed for speeding the
full (minimum distortion) search of an arbitrary codebook
C containing N k-dimensional codevectors, for example
one generated by a Lloyd algorithm. In contrast to
codebooks to be considered later these will be called
unstructured. As a group these techniques use substantial amounts
of additional memory in order to significantly reduce
arithmetic complexity. A variety of such techniques are
mentioned in Section 12.16 of [196].

A number of fast search techniques are similar in spirit to
the following: the Euclidean distances between all pairs of
codevectors are precomputed and stored in a table. Now,
given a source vector x to quantize, some initial codevector
y is chosen. Then all codevectors $y_i$ whose distance from y
is greater than $2\|x - y\|$ are eliminated from further
consideration because they cannot be closer than y. Those not
eliminated are successively compared to x until one that
is closer than y is found, which then replaces y, and the
process continues. In this way the set of potential
codevectors is gradually narrowed. Techniques in this category,
with different ways of narrowing the search, may be found
in [362], [517], [475], [476], [363], [426], [249], [399], [273],
[245], [229], [332], [307], [547], [308], [493].
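
The elimination rule described above follows from the
triangle inequality: if $\|y - y_i\| > 2\|x - y\|$, then
$\|x - y_i\| \ge \|y - y_i\| - \|x - y\| > \|x - y\|$. The
following is a minimal sketch of one such search, not the
method of any particular reference cited above; the function
names, the choice of the first codevector as the initial y,
and the order in which survivors are tried are all
assumptions made for illustration.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def build_distance_table(codebook):
    """Precompute the distances between all pairs of codevectors."""
    return [[euclidean(u, v) for v in codebook] for u in codebook]


def fast_search(x, codebook, table, start=0):
    """Return (index, distance, number of distances computed to x)."""
    best = start
    best_dist = euclidean(x, codebook[best])
    computed = 1
    candidates = [i for i in range(len(codebook)) if i != best]
    while candidates:
        # Discard every codevector farther than 2||x - y|| from the current y;
        # by the triangle inequality it cannot be closer to x than y is.
        candidates = [i for i in candidates if table[best][i] <= 2.0 * best_dist]
        # Try the survivors in turn until one beats the current y.
        while candidates:
            i = candidates.pop(0)
            d = euclidean(x, codebook[i])
            computed += 1
            if d < best_dist:
                best, best_dist = i, d
                break  # re-prune around the new, closer codevector
    return best, best_dist, computed


# Toy codebook: a 4 x 4 grid of points (illustrative only).
codebook = [(float(i), float(j)) for i in range(4) for j in range(4)]
table = build_distance_table(codebook)
best, best_dist, computed = fast_search((2.2, 0.9), codebook, table)
```

The table costs storage on the order of $N^2$ entries, which
illustrates the memory-for-arithmetic trade mentioned at the
start of this subsection.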

A number of other fast search techniques begin with a
"coarse" prequantization with some very low complexity
technique. It is called "coarse" because it typically has
larger cells than the Voronoi regions of the codebook C
that is being searched. The coarse prequantization
often involves scalar quantization of some type or a
tree-structuring of binary quantizers, such as what are called
K-d trees. Associated with each coarse cell is a bucket
containing the indices of each codevector that is the nearest
codevector to some source vector in the cell. These
buckets are determined in advance and saved as tables. Then
to encode a source vector x, one applies the
prequantization, finds the index of the prequantization cell in which x
is contained, and performs a full search on the
corresponding bucket for the closest codevector to x. Techniques of
this type may be found in [44], [176], [88], [89], [334], [146],
[532], [423], [415], [500], [84]. In some of these, the coarse
prequantization is one-dimensional; for example, the length
of the source vector may be quantized, and then the bucket
of all codevectors having similar lengths is searched for the
closest codevector.

Another class of techniques is like the previous except
that the low complexity prequantization has much smaller
cells than the Voronoi cells of C, i.e. it is finer. In this
case, the buckets associated with most "fine"
prequantization cells contain just one codevector, i.e. the same
codevector in C is the closest codevector to each point in the
fine cell. The indices of these codevectors, one for each
fine cell, are stored in a precomputed table. For each of
those relatively few fine cells that have buckets
containing more than one codevector, one member of the bucket
is chosen and its index is placed in the table as the entry
for that fine cell. Quantization of x then proceeds by
applying the fine prequantizer and then using the index of
the fine cell in which x lies to address the table
containing codevectors from C, which then outputs the index of
a codeword in C. Due to the fact that not every bucket
contains only one codevector, such techniques, which may
be found in [86], [358], [357], [518], [75], [219], do not do
a perfect full search. Some quantitative analysis of the
increased distortion is given in [356] for a case where the
prequantization is a lattice quantizer. Other fast search
methods include the partial distortion method of [88], [39],
[402] and the transform subspace domain approach of [78].
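
The fine-prequantization table lookup described above can be
sketched as follows. This is only an illustration under
stated assumptions: the prequantizer is a uniform scalar grid
applied to each of two coordinates, the table entry for each
fine cell is the codevector nearest to the cell center (one
possible way of choosing "one member of the bucket"), and all
names and numerical values are hypothetical.

```python
def nearest_index(x, codebook):
    """Full-search index of the codevector closest to x (squared error)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))


def build_fine_table(codebook, lo, hi, cells):
    """Precompute one codevector index per fine cell of a grid on [lo, hi)^2."""
    step = (hi - lo) / cells
    table = {}
    for a in range(cells):
        for b in range(cells):
            center = (lo + (a + 0.5) * step, lo + (b + 0.5) * step)
            table[(a, b)] = nearest_index(center, codebook)
    return table, step


def table_encode(x, table, lo, step, cells):
    """Encode x by scalar prequantization of each coordinate and one lookup."""
    key = tuple(min(cells - 1, max(0, int((v - lo) // step))) for v in x)
    return table[key]


codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
lo, hi, cells = -0.5, 1.5, 8
table, step = build_fine_table(codebook, lo, hi, cells)
index = table_encode((0.9, 0.2), table, lo, step, cells)
```

Encoding requires no distortion computations at all, only the
scalar prequantization and a memory access. In this toy case
the fine cells happen to align with the Voronoi boundaries, so
the lookup agrees with full search; in general it does not.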

Consideration of methods based on prequantization leads
to the question of how fine the prequantization cells should
be. Our experience is that the best tradeoffs come when
the prequantization cells are finer rather than coarser, the
explanation being that if one has prequantized coarsely and
now has to determine which codevector in a bucket is
closest to x, it is more efficient to use some fast search method
than to do full search. Dividing the coarse cells into finer
ones is a way of doing just this. Another question that
arises for all fast search techniques is whether it is worth the
effort to perform a full search or whether one should instead
stop short of this, as in the methods with fine
prequantization cells. Our experience is that it is usually not worth the
effort to do a full search, because by suffering only a very
small increase in MSE one can achieve a significant
reduction in arithmetic complexity and storage. Moreover, in the
case of stationary sources where the dimension is subject
to choice, for a given amount of arithmetic complexity and
storage, one almost always gets better performance by
doing a suboptimal search of a higher dimensional codebook
than a full search of a lower dimensional one.

Fast search methods based on fine prequantization can
be improved by optimizing the codebook for the given
prequantizer. Each cell of the partition corresponding to C
induced by prequantization followed by table lookup is the
union of some number of fine cells of the prequantizer. Thus
the question becomes: what is the best partition into N
cells, each of which is the union of some number of fine cells?
The codevectors in C should then be the centroids of these
cells. Such techniques have been exploited in [86], [358].
One technique worth particular mention is called
hierarchical table lookup VQ [86], [518], [75], [219]. In this case,
the prequantizer is itself an unstructured codebook that is
searched with a fine prequantizer that is in turn searched
with an even finer prequantizer, and so on. Specifically, the
first prequantizer uses a high rate scalar quantizer k times.
The next level of prequantization applies a two-dimensional
VQ to each of k/2 pairs of scalar quantizer outputs. The
next level applies a four-dimensional VQ to each of k/4
pairs of outputs from the two-dimensional quantizers, and
so on. Hence the method is hierarchical. Because each
of the quantizers can be implemented entirely with table
lookup, this method eliminates all arithmetic complexity
except memory accesses. It has been successfully used for
video coding [518], [75].

B. Structured Quantizers 

We now turn to quantizers with structured partitions
or reproduction codebooks, which in turn lend themselves
to fast searching techniques and, in some cases, to greatly
reduced storage. Many of these techniques are discussed
in [196], [458].

Lattice Quantizers 

Lattice quantization can be viewed as a vector
generalization of uniform scalar quantization. It constrains the
reproduction codebook to be a subset of a regular
lattice, where a lattice is the set of all vectors of the form
$\sum_{i=1}^{n} m_i u_i$, where the $m_i$ are integers and the
$u_i$ are linearly independent (usually nondegenerate, i.e.,
n = k). The resulting Voronoi partition is a tessellation
with all cells (except for those overlapping the overload
region) having the same shape, size and orientation. Lattice
quantization was proposed by Gersho [193] because of its near
optimality for high resolution variable-rate quantization and,
also, its near optimality for high resolution fixed-rate
quantization of uniformly distributed sources. (These assume that
Gersho's conjecture holds and that the best lattice quantizer
is approximately as good as the best tessellation.)
Especially important is the fact that their highly structured
nature has led to algorithms for implementing their lossy
encoders with very low arithmetic and storage
complexities [103], [104], [105], [459], [106], [199]. These find the
integers $m_i$ associated with the closest lattice point.
Conway and Sloane [104], [106] have reported the best known
lattices for several dimensions, as well as fast quantizing
and decoding algorithms. Some important n-dimensional
lattices are the root lattices $A_n$ ($n \ge 1$), $D_n$
($n \ge 2$), and $E_n$ ($n = 6, 7, 8$), the Barnes-Wall lattice
$\Lambda_{16}$ in dimension 16, and the Leech lattice
$\Lambda_{24}$ in 24 dimensions. These latter
give the best sphere packings and coverings in their
respective dimensions. Recently, Agrell and Eriksson [5]
have found improved lattices in dimensions 9 and 10.
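
To give a flavor of how simple a lattice lossy encoder can
be, the following sketch finds the closest point of the
integer lattice $Z^n$ and of the root lattice $D_n$ (integer
vectors with even coordinate sum) using the well-known
rounding rule: round every coordinate, and if the sum is odd,
re-round the coordinate with the largest rounding error in
the other direction. The function names are hypothetical, the
lattice is unscaled, and no claim is made that this matches
the exact formulation in any specific reference cited above.

```python
import math


def round_nearest(v):
    """Round to the nearest integer, with ties going upward."""
    return math.floor(v + 0.5)


def closest_point_zn(x):
    """Closest point of the integer lattice Z^n: round each coordinate."""
    return [round_nearest(v) for v in x]


def closest_point_dn(x):
    """Closest point of D_n, the integer vectors whose coordinates sum to an even number."""
    f = closest_point_zn(x)
    if sum(f) % 2 == 0:
        return f
    # The sum is odd: move the coordinate with the largest rounding
    # error to its second-nearest integer, which flips the parity.
    errors = [v - fi for v, fi in zip(x, f)]
    j = max(range(len(x)), key=lambda i: abs(errors[i]))
    g = list(f)
    g[j] += 1 if errors[j] > 0 else -1
    return g


point_zn = closest_point_zn([0.6, -1.4])
point_dn = closest_point_dn([0.6, 0.1])
```

The cost is a handful of operations per sample and no
codebook storage, in contrast with the exponential growth of
full search. A scaled lattice can be handled by dividing the
input by the scale factor before the search and multiplying
the result afterwards.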

Though low complexity algorithms have been found for
the lossy encoder, there are other issues that affect the
performance and complexity of lattice quantizers. For
variable-rate coding, one must scale the lattice to obtain
the desired distortion and rate, and one must implement
an algorithm for mapping the $m_i$'s to the variable-length
binary codewords. The latter could potentially add much
complexity. For fixed-rate coding with rate R, the lattice
must be scaled and a subset of $2^{kR}$ lattice points must
be identified as the codevectors. This induces a support
region. If the source has finite support, the lattice
quantizer will ordinarily be chosen to have the same support. If
not, then the scaling factor and lattice subset are usually
chosen so that the resulting quantizer support region has
large probability. In either case a low complexity method is
needed for assigning binary sequences to the chosen
codevectors; i.e. for indexing. Conway and Sloane [105] found
such a method for the important case that the support has
the shape of an enlarged cell. For sources with infinite
support, such as i.i.d. Gaussian, there is also the difficult
question of how to quantize a source vector x lying
outside the support region. For example, one might scale x so
that it lies on or just inside the boundary of the support
region, and then quantize the scaled vector in the usual
way. Unfortunately, this simple method does not always
find the closest codevector to x. Indeed, it often increases
overload distortion substantially over that of the
minimum-distance quantization rule. To date, there is apparently no
low complexity method that does not substantially increase
overload distortion.

High resolution theory applies immediately to lattice VQ
when the entire lattice is considered to be the codebook.
The theory becomes more difficult if, as is usually the case,
only a bounded portion of the lattice is used as the
codebook and one must separately consider granular and
overload distortion. There are a variety of ways of considering
the tradeoffs involved, cf. [580], [151], [359], [149], [409].
In any case, the essence of a lattice code is its uniform
point density and nicely shaped cells with low normalized
moment of inertia. For fixed-rate coding, they work well
for uniform sources or other sources with bounded
support. But as discussed earlier, for sources with unbounded
support such as i.i.d. Gaussian, they require very large
dimensions to achieve performance close to $\delta(R)$.

Product Quantizers 

A product quantizer uses a reproduction codebook that
is the Cartesian product of lower dimensional reproduction
codebooks. For example, the application of a scalar
quantizer to k successive samples $X_1, X_2, \ldots, X_k$ can be
viewed as a product quantizer operating on the k-dimensional
vector $X = (X_1, X_2, \ldots, X_k)$. The product structure makes
searching easier and, unlike the special case of a sequence
of scalar quantizers, the search need not consist of
k independent searches. Products of vector quantizers are
also possible. Typically, the product quantizer is applied,
not to the original vector of samples, but to some functions
or features extracted from the vector. The complexities of
a product quantizer (arithmetic and storage, encoding and
decoding) are the sums of those of the component
quantizers. As such they are ordinarily much less than the
complexities of an unstructured quantizer with the same
number of codevectors, whose complexities equal the
product of those of the components of a product quantizer.

A shape-gain vector quantizer [449], [450] is an
example of a product quantizer. It uses a product
reproduction codebook consisting of a gain codebook
$C_g = \{g_i;\ i = 1, \ldots, N_g\}$ of positive scalars and a
shape codebook $C_s = \{s_j;\ j = 1, \ldots, N_s\}$ of unit norm
k-dimensional vectors, and the overall reproduction vector is
defined by $\hat{x} = g s$.
It is easy to see that the minimum squared error reproduction
codeword $g_i s_j$ for an input vector x is found by the
following encoding algorithm: First choose the index j that
maximizes the correlation $x^t s_j$; then for this chosen j choose
the index i minimizing $|g_i - x^t s_j|$. This sequential rule
gives the minimum squared error reproduction codeword
without explicitly normalizing the input vector (which would
be computationally expensive). The encoder and decoder
are depicted in Figure 8.

Fig. 8. Shape-Gain VQ



A potential advantage of such a system is that by
separating these two "features," one is able to use a scalar
quantizer for the gain feature and a lower rate codebook
for the shape feature, which can then have a higher
dimension, for the same search complexity. A major issue arises
here: given a total rate constraint, how does one best divide
the bits between the two codebooks? This is an example of
a rate allocation problem that arises in all product
codebooks and about which more will be said shortly.

It is important to notice that the use of a product
quantizer does not mean the use of independent quantizers for
each component. As with shape-gain VQ, the optimal lossy
encoder will in general not view only one coordinate at a
time. Separate and independent quantization of the
components provides a low complexity but generally
suboptimal encoder. In the case of the shape-gain VQ, the
optimal lossy encoder is happily a simple sequential
operation, where the gain quantizer is scalar, but the selection
of one of its quantization levels depends on the result of
another quantizer, the shape quantizer. Similar ideas can be
used for mean-removed VQ [20], [21] and mean/gain/shape
VQ [392]. The most general formulation of product codes
has been given by Chan and Gersho [82]. It includes a
number of schemes with dependent quantization, even
tree-structured and multistage quantization, to be discussed
later.
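
The sequential shape-gain encoding rule described above
(first the shape index maximizing the correlation with the
input, then the gain level closest to that correlation) can
be sketched as follows. The function names and the toy
codebooks are hypothetical and chosen only for illustration;
the sketch assumes positive gains and unit norm shape
vectors, as in the text.

```python
import math


def shape_gain_encode(x, gains, shapes):
    """Return (i, j): gain index i and shape index j chosen sequentially."""
    # Step 1: pick the shape with maximum correlation x^t s_j.
    corr = [sum(a * b for a, b in zip(x, s)) for s in shapes]
    j = max(range(len(shapes)), key=lambda idx: corr[idx])
    # Step 2: pick the gain closest to that correlation.
    i = min(range(len(gains)), key=lambda idx: abs(gains[idx] - corr[j]))
    return i, j


def shape_gain_decode(i, j, gains, shapes):
    """Reproduction vector g_i * s_j."""
    return [gains[i] * c for c in shapes[j]]


r = 1.0 / math.sqrt(2.0)
shapes = [(1.0, 0.0), (0.0, 1.0), (r, r), (-1.0, 0.0)]   # unit norm shape codebook
gains = [0.5, 1.0, 2.0]                                   # positive gain codebook
x = (1.8, 0.3)
i, j = shape_gain_encode(x, gains, shapes)
x_hat = shape_gain_decode(i, j, gains, shapes)
```

The search costs about $N_s$ inner products plus $N_g$
scalar comparisons, rather than the $N_g N_s$ distortion
computations of a full search over all products, and the
input is never normalized.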

Fischer's pyramid VQ [164] is also a kind of shape-gain
VQ. In this case, the codevectors of the shape
codebook are constrained to lie on the surface of a k-dimensional
pyramid, namely, the set of all vectors whose components
have magnitudes summing to one. Pyramid VQ's are very
well suited to i.i.d. Laplacian sources. An efficient method
for indexing the shape codevectors is needed and a suitable
method is included in pyramid VQ.

Two-dimensional shape-gain product quantizers,
usually called polar quantizers, have been extensively
developed [182], [183], [407], [406], [61], [62], [530], [489], [490],
[483], [485], [488], [360]. Here, a two-dimensional source
vector is represented in polar coordinates and, in the basic
scheme, the codebook consists of the Cartesian product of
a nonuniform scalar codebook for the magnitude and a
uniform scalar codebook for the phase. Early versions of polar
quantization used independent quantization of the
magnitude and phase information, but later versions used the
better method described above, and some even allowed the
phase quantizers to have a resolution that depends on the
outcome of the magnitude quantizer. Such polar quantizers
are called "unrestricted" [530], [488]. High resolution
analysis can be used to study the rate-distortion performance of
these quantizers [61], [62], [483], [485], [488], [360]. Among
other things, such analyses find the optimal point density
for the magnitude quantizer and the optimal bit allocation
between magnitude and phase. Originally, methods were
developed specifically for polar quantizers. However,
recently it has been shown that Bennett's integral can be
applied to analyze polar quantization in a straightforward
way [380]. It turns out that for an i.i.d. Gaussian source,
optimized conventional polar quantization gains about 0.41
dB over direct scalar quantization, and optimized
unrestricted polar quantization gains another 0.73 dB. Indeed
the latter has, asymptotically, square cells and the optimal
two-dimensional point density, and loses only 0.17 dB
relative to optimal two-dimensional vector quantization, but is
still 3.11 dB from $\delta(R)$.

Product quantizers can be used for any set of features 
deemed natural for decomposing a vector. Perhaps the 
most famous example is one we have seen already and now 
revisit: transform coding. 

Transform Coding 

Though the goal of this section is mainly to discuss
techniques beyond scalar quantization, DPCM and transform
coding, we discuss the latter here because of its
relationships to other techniques and because we wish to discuss
work on the bit allocation problem.

Traditional transform coding can be viewed as a product
quantizer operating on the transform coefficients resulting
from a linear transform on the original vector. We have
already mentioned the traditional high resolution fixed-rate
analysis and the more recent high resolution entropy
constrained analysis for separate lossless coding of each
quantized transform coefficient. An asymptotic low resolution
analysis [338], [339] has also been performed. In almost
all actual implementations, however, scalar quantizers are
combined with a block lossless code, where the lossless code
is allowed to effectively operate on an entire block of
quantized coefficients at once, usually by combining run-length
coding with Huffman or arithmetic coding. As a result, the
usual high resolution analyses are not directly applicable.

Although high resolution theory shows that the
Karhunen-Loève transform is optimal for Gaussian sources,
and the asymptotic low resolution analysis does likewise,
the dominant transform for many years has been the
discrete cosine transform (DCT) used in most current image
and video coding standards. The primary competition for
future standards comes from discrete wavelet transforms,
which will be considered shortly. One reason for the use
of the DCT is its lower complexity. An "unstructured"
transform like the Karhunen-Loève requires approximately
2k operations per sample, which is small compared to the
arithmetic complexity of unstructured VQ, but large
compared to the approximately log k operations per sample for
a DCT. Another motivation for the DCT is that in some
sense it approximates the behavior of the Karhunen-Loève
transform for certain sources. And a final motivation is
that the frequency decomposition done by the DCT
mimics, to some extent, that done by the human visual system
and so one may quantize the DCT coefficients taking
perception into account. We will not delve into the large
literature of transforms, but will observe that bit allocation
becomes an important issue, and one can either use the high
resolution approximations or a variety of nonasymptotic
allocation algorithms such as the "fixed-slope" or
Pareto-optimality considered in [526], [470], [94], [439], [438], [463].
The method involves operating all quantizers at points on
their operational distortion-rate curves of equal slopes. For
a survey of some of these methods, see [107] or Chapter 10
of [196]. A combinatorial optimization method is given
in [546].
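
As a simple illustration of nonasymptotic bit allocation,
the sketch below hands out bits one at a time, each time to
the component quantizer whose distortion would drop the
most. For convex distortion-rate curves this greedy marginal
analysis ends with all quantizers operating at roughly equal
slopes. It is not claimed to be the algorithm of any
reference cited above; the function names are hypothetical,
and the model $D_i(b) = \sigma_i^2 2^{-2b}$ is an assumed
stand-in for measured operational distortion-rate curves.

```python
def distortion(variance, bits):
    """Assumed high resolution model D(b) = variance * 2^(-2b)."""
    return variance * 2.0 ** (-2 * bits)


def total_distortion(variances, allocation):
    return sum(distortion(v, b) for v, b in zip(variances, allocation))


def greedy_bit_allocation(variances, total_bits):
    """Give each successive bit to the coefficient with the steepest distortion drop."""
    allocation = [0] * len(variances)
    for _ in range(total_bits):
        decrease = [distortion(v, b) - distortion(v, b + 1)
                    for v, b in zip(variances, allocation)]
        j = max(range(len(variances)), key=lambda i: decrease[i])
        allocation[j] += 1
    return allocation


variances = [16.0, 4.0, 1.0]   # hypothetical transform coefficient variances
total_bits = 6
bits = greedy_bit_allocation(variances, total_bits)
```

For these variances the greedy rule yields the allocation
[3, 2, 1], which coincides with the familiar high resolution
formula (average rate plus half the base-2 logarithm of the
ratio of each variance to the geometric mean) because that
formula happens to give integers here.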

As a final comment on traditional transform coding,
the code can be considered as being suboptimal as a
k-dimensional quantizer because of the constrained structure
(transform and product code). It gains, however, in having
a low complexity, and transform codes remain among the
most popular compression systems because of their balance
of performance and complexity.

Subband/wavelet/pyramid Quantization

Subband codes, wavelet codes, and pyramid codes are
intimately related and all are cousins of a transform code.
The oldest of these methods (so far as quantization is
concerned) is the pyramid code of Burt and Adelson [66] (which
is quite different from Fischer's pyramid VQ). The Burt
and Adelson pyramid is constructed from an image first
by forming a Gaussian pyramid by successively low pass
filtering and downsampling, and then by forming a
Laplacian pyramid which replaces each layer of the Gaussian
pyramid by a residual image formed by subtracting a
prediction of that layer based on the lower resolution layers.
The resulting pyramid of images can then be quantized,
e.g., by scalar quantizers. The approximation for any layer
can be reconstructed by using the inverse quantizers
(reproduction decoders) and upsampling and combining the
reconstructed layer and all lower resolution reconstructed
layers. Note that as one descends the pyramid, one easily
combines the new bits for that layer with the bits already
used to produce a higher resolution spatially and in
amplitude. The pyramid code can be viewed as one of the
original multiresolution codes. It can be viewed as a transform
code because the entire original structure can be viewed as
a linear transform of the original image, but observe that
the number of pixels has been roughly doubled.
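
The construction and reconstruction of such a pyramid can be
sketched in one dimension. This is a simplified illustration,
not the Burt and Adelson scheme itself: averaging adjacent
pairs stands in for low pass filtering and downsampling,
sample repetition stands in for the interpolating prediction,
no quantizer is applied to the layers, and all names are
hypothetical.

```python
def downsample(signal):
    """Crude low pass filter and 2:1 downsampling: average adjacent pairs."""
    return [(signal[2 * i] + signal[2 * i + 1]) / 2.0 for i in range(len(signal) // 2)]


def upsample(signal):
    """Crude prediction of the finer layer: repeat every sample twice."""
    out = []
    for v in signal:
        out.extend([v, v])
    return out


def build_laplacian_pyramid(signal, levels):
    """Return (residual layers from fine to coarse, coarsest approximation)."""
    residuals = []
    current = list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        prediction = upsample(coarse)
        residuals.append([c - p for c, p in zip(current, prediction)])
        current = coarse
    return residuals, current


def reconstruct(residuals, top):
    """Descend the pyramid: upsample, then add the residual of each layer."""
    current = list(top)
    for residual in reversed(residuals):
        prediction = upsample(current)
        current = [p + r for p, r in zip(prediction, residual)]
    return current


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 2.0, 0.0]
residuals, top = build_laplacian_pyramid(signal, levels=2)
rebuilt = reconstruct(residuals, top)
n_values = sum(len(r) for r in residuals) + len(top)
```

Without quantization the reconstruction is exact. The
pyramid is overcomplete: here 8 input samples become 14
stored values, and in one dimension the count approaches
twice the signal length as the number of levels grows.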

Subband codes decompose an image into separate
images by using a bank of linear filters, hence once again
performing a linear transformation on the data prior to
quantizing it. Traditional subband coding used filters of
equal or roughly equal bandwidth. Wavelet codes can be
viewed as subband codes of logarithmically varying
bandwidths instead of equal bandwidths, where the filters used
satisfy certain properties. Since the introduction of
subband codes in the late eighties and wavelet codes in the
early 1990's, the field has blossomed and produced several
of the major contenders for the best speech and image
compression systems. The literature is beyond the scope of this
article to survey, and much is far more concerned with the
transforms, filters, or basis functions used and the lossless
coding used following quantization than with the
quantization itself. Hence we content ourselves with the mention
of a few highlights. The interested reader is referred to the
book by Vetterli and Kovačević on wavelets and subband
coding [516].

Subband coding was introduced in the context of speech
coding in 1976 by Crochiere et al. [113]. The extension
of subband filtering from 1-D to 2-D was made by
Vetterli [515] and 2-D subband filtering was first applied to
image coding by Woods et al. [541], [527], [540]. Early
wavelet coding techniques emphasized scalar or lattice
vector quantization [12], [13], [130], [463], [14], [30], [185] and
other vector quantization techniques have also been applied
to wavelet coefficients, including tree encoding [366],
residual vector quantization [295], and other methods [107]. A
major breakthrough in performance and complexity came
with the introduction of zerotrees [315], [466], [457], which
provided an extremely efficient embedded representation
of scalar quantized wavelet coefficients, called embedded
zerotree wavelet (EZW) coding. As done by JPEG in a
primitive way, the zerotree approach led to a code which first
sent bits about the transform coefficients with the largest
magnitude, and then sent subsequent bits describing these
significant coefficients to greater accuracy as well as bits
about originally less significant coefficients that became
significant as the accuracy improved. The zerotree approach
has been extended to vector quantization (e.g., [109]), but
the slight improvement comes at a significant cost in added
complexity. Rate-distortion ideas have been used to
optimize the rate-distortion tradeoffs using wavelet packets
by minimizing a Lagrangian distortion over code trees and
bit assignments [427]. Recently competitive schemes have
demonstrated that separate scalar quantization of
individual subbands coupled with a sophisticated but low
complexity lossless coding algorithm called stack-run coding
can provide performance nearly as good as EZW [504].

The best wavelet codes tend to use very smart lossless 
codes, lossless codes which effectively code very large vec- 
tors. While wavelet advocates may credit the decomposi- 
tion itself for the gains in compression, the theory suggests 
that rather it is the fact that vector entropy coding for very 
large vectors is feasible. 

Scalar-Vector Quantization 

Like permutation vector quantization and Fischer's pyramid 
vector quantizer, Laroia and Farvardin's [305] scalar-vector 
quantization attempts to match the performance 
of an optimal entropy constrained scalar quantizer with 
a low complexity fixed-rate structured vector quantizer. 
A derivative technique called block constrained 
quantization [24], [27], [23], [28] is simpler and easier to describe. 
Here the reproduction codebook is a subset of the k-fold 
product of some scalar codebook. Variable-length binary 
codewords are associated with the scalar levels, and given 
some target rate R, the k-dimensional codebook contains 
only those sequences of k quantization levels for which the 
sum of the lengths of the binary codewords associated with 
the levels is at most kR. The minimum distortion 
codevector can be found using dynamic programming. 
Alternatively, an essentially optimal search can be performed 
with very low complexity using a knapsack packing or 
Lagrangian approach. The output of the encoder is the 
sequence of binary codewords corresponding to the 
codevector that was found, plus some padded bits if the total does 
not equal kR. The simplest method requires approximately 
$20N^2/k + 20$ operations per sample and storage for 
approximately $N^2$ numbers, where N is the number of scalar 
quantization levels. The original scalar-vector method differs in 
that rational lengths rather than binary codewords are 
assigned to the scalar quantizer levels, dynamic programming 
is used to find the best codevector, and the resulting 
codevectors are losslessly encoded with a kind of lexicographic 
encoding. For i.i.d. Gaussian sources these methods attain 
SNR within about 2 dB of $\delta(R)$ with k on the order of 100, 
which is about .5 dB from the goal of 1.53 dB larger than 
$\delta(R)$. A high resolution analysis is given in [26], [23]. The 
scalar-vector method extends to sources with memory by 
combining it with transform coding using a decorrelating 
or approximately decorrelating transform [305]. 
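
The dynamic programming search for block constrained 
quantization can be sketched as follows. This is only an 
illustration under stated assumptions: squared error distortion, 
integer codeword lengths, and an integer bit budget kR. The 
function name, the example levels, and the example lengths are 
hypothetical, and the sketch is not the low complexity search 
described in the cited papers. 

```python
def block_constrained_encode(x, levels, lengths, budget):
    """Pick one scalar level per sample so that the total codeword
    length is at most `budget` bits and the squared error is minimal.

    Returns (level indices, distortion, number of padding bits).
    """
    # best[b] = (distortion, path) of the best prefix using exactly b bits.
    best = {0: (0.0, [])}
    for sample in x:
        nxt = {}
        for used, (dist, path) in best.items():
            for j, (y, n) in enumerate(zip(levels, lengths)):
                b = used + n
                if b > budget:
                    continue  # this extension would exceed kR bits
                cand = dist + (sample - y) ** 2
                if b not in nxt or cand < nxt[b][0]:
                    nxt[b] = (cand, path + [j])
        best = nxt
    if not best:
        raise ValueError("budget too small for any level sequence")
    used, (dist, path) = min(best.items(), key=lambda item: item[1][0])
    return path, dist, budget - used


# Hypothetical 4-level scalar codebook with prefix-code lengths 3, 1, 2, 3.
LEVELS = [-1.5, -0.5, 0.5, 1.5]
LENGTHS = [3, 1, 2, 3]
```

With the budget set to 5 bits for the input [1.4, -0.4, 0.6], the 
nearest levels would need 6 bits, so the search trades the first 
sample down to a cheaper level. With a budget of 7 bits the 
nearest levels fit and one padding bit is left over. 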

Tree-Structured Quantization 

In its original and simplest form, a k-dimensional 
tree-structured vector quantizer (TSVQ) [69] is a fixed-rate 
quantizer with, say, rate R whose encoding is guided by 
a balanced (fixed-depth) binary tree of depth kR. There 
is a codevector associated with each of its $2^{kR}$ terminal 
nodes (leaves), and a k-dimensional testvector associated 
with each of its $2^{kR} - 1$ internal nodes. Quantization of a 
source vector x proceeds in a tree-structured search by 
finding which of the two nodes stemming from the root node 
has the closer testvector to x, then finding which of the two 
nodes stemming from this node has the closer testvector, 
and so on, until a terminal node and codevector are found. 
The binary encoding of this codevector consists of the 
sequence of kR binary decisions that lead to it. Decoding 
is done by table lookup as in unstructured VQ. As in 
successive approximation scalar quantization, TSVQ yields an 
embedded code with a naturally progressive structure. 
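
A minimal sketch of the tree-structured search, assuming squared 
error distortion and a heap-style node numbering (root 1, 
children 2n and 2n + 1) that is our own convention rather than 
anything prescribed by the cited work. The dictionary 
`vectors` holds the testvector or codevector of every non-root 
node, and the example tree is made up for illustration. 

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def tsvq_encode(x, vectors, depth):
    """Descend the binary tree, at each node moving to the child whose
    stored vector is closer to x. Returns the list of binary decisions."""
    node, bits = 1, []
    for _ in range(depth):
        left, right = 2 * node, 2 * node + 1
        bit = 0 if sq_dist(x, vectors[left]) <= sq_dist(x, vectors[right]) else 1
        bits.append(bit)
        node = left + bit
    return bits


def tsvq_decode(bits, vectors):
    """Table lookup: follow the decisions and return the stored vector.
    A prefix of the bits gives a coarser (embedded) reproduction."""
    node = 1
    for bit in bits:
        node = 2 * node + bit
    return vectors[node]


# Hypothetical depth-2 tree for k = 2: nodes 2-3 hold testvectors,
# nodes 4-7 hold codevectors.
TREE = {2: (-1, 0), 3: (1, 0),
        4: (-1, -1), 5: (-1, 1), 6: (1, -1), 7: (1, 1)}
```

For x = (0.8, 0.3) the search visits node 3 and then leaf 7, so 
the two transmitted bits are 1, 1. Decoding only the first bit 
returns the testvector at node 3, which illustrates the 
progressive structure. 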

With this method, encoding requires storing the tree 
of testvectors and codevectors, demanding approximately 
twice the storage of an unstructured codebook. However, 
encoding requires only 2kR distortion calculations, which is 
a tremendous decrease over the $2^{kR}$ required by full search 
of an unstructured codebook. In the case of squared 
error distortion, instead of storing testvectors and computing 
the distortion between x and each of them, at each internal 
node one may store the normal to the hyperplane 
bisecting the testvectors at the two nodes stemming from it, and 
determine on which side of the hyperplane x lies by 
comparing an inner product of x with the normal to a threshold 
that is also stored. This reduces the arithmetic complexity 
and storage roughly in half to approximately kR operations 
per sample and $2^{kR}$ vectors. Further reductions in storage 
are possible, as described in [252]. 
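
The hyperplane test follows from expanding the two squared 
distances: x is closer to the right testvector b than to the left 
testvector a exactly when the inner product of x with b - a 
exceeds (|b|^2 - |a|^2)/2. A small sketch, with hypothetical 
function names, that precomputes the normal and threshold: 

```python
def hyperplane_node(left, right):
    """Precompute the normal and threshold of the hyperplane that
    bisects the segment between the two child testvectors."""
    normal = [r - l for l, r in zip(left, right)]
    threshold = (sum(r * r for r in right) - sum(l * l for l in left)) / 2
    return normal, threshold


def go_right(x, normal, threshold):
    """One inner product and one comparison replace two distance
    computations; True means the right child is strictly closer."""
    return sum(xi * ni for xi, ni in zip(x, normal)) > threshold


def sq_dist(a, b):
    """Squared Euclidean distance, used here only to check the shortcut."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
```

Only the normal and the threshold need to be stored at each 
internal node, which is the source of the storage saving noted 
above. 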

The usual (but not necessarily optimal) greedy method 
for designing a balanced TSVQ [69], [225] is to first de- 
sign the testvectors stemming from the root node using 
the Lloyd algorithm on a training set. Then design the 
two testvectors stemming from, say, the left one of these 
by running the Lloyd algorithm on the training vectors that 
were mapped to the left one, and so on. 
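
The greedy design can be sketched as a recursion that runs a 
two-codevector Lloyd iteration at each node and hands each child 
the training vectors mapped to it. The sketch assumes squared 
error, a heap-style node numbering (root 1, children 2n and 
2n + 1), a fixed iteration count, and an initialization that 
perturbs the centroid by a small constant. All of these are 
illustrative choices, and empty cells are not handled beyond 
stopping the recursion. 

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def centroid(vectors):
    k = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(k)]


def partition(train, pair):
    """Split the training set by nearest of the two vectors in `pair`."""
    cells = ([], [])
    for v in train:
        cells[0 if sq_dist(v, pair[0]) <= sq_dist(v, pair[1]) else 1].append(v)
    return cells


def lloyd_pair(train, iterations=20, eps=1e-3):
    """Two-level Lloyd algorithm started from a perturbed centroid."""
    c = centroid(train)
    pair = [[ci - eps for ci in c], [ci + eps for ci in c]]
    for _ in range(iterations):
        cells = partition(train, pair)
        pair = [centroid(cell) if cell else old
                for cell, old in zip(cells, pair)]
    return pair, partition(train, pair)


def design_tsvq(train, depth, node=1, vectors=None):
    """Greedy top-down design: returns {node number: vector}."""
    if vectors is None:
        vectors = {}
    if depth == 0 or not train:
        return vectors
    pair, cells = lloyd_pair(train)
    for bit in (0, 1):
        child = 2 * node + bit
        vectors[child] = pair[bit]
        design_tsvq(cells[bit], depth - 1, child, vectors)
    return vectors
```

On a toy training set of four points at (-2, -1), (-2, 1), (2, -1), 
and (2, 1), a depth-2 design places the two first-level 
testvectors at (-2, 0) and (2, 0) and one codevector on each 
training point. 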

In the scalar case, a tree can be found that implements 
any quantizer, indeed the optimal quantizer. So 
tree-structuring loses nothing, though the above design 
algorithm does not necessarily generate the best possible 
quantizers. In the multidimensional case, one cannot expect 
that the greedy algorithm will produce a TSVQ that is as 
good as the best unstructured VQ or even the best possible 
TSVQ. Nevertheless, it seems to work pretty well. It has 
been observed that in the high resolution case, the cells of 
the resulting TSVQs are mostly a mixture of cubes, cubes 
cut in half, the latter cut in half again, and so on until 
smaller cubes are formed. And it has been found for i.i.d. 
Gaussian and Gauss-Markov sources that the performances 
of TSVQs with moderate to high rates designed by the 
greedy algorithm are fairly well predicted by Bennett's 
integral, assuming the point density is optimum and the cells 
are an equal mixture of cubes, cubes cut in half and so on. 
This sort of analysis indicates that the primary weakness of 
TSVQ is in the shapes of the cells that it produces. 
Specifically, its loss relative to optimal k-dimensional fixed-rate 
VQ ranges from .7 dB for k = 2 to 2.2 dB for very large 
dimensions. Part of the loss is $(1/12)/M_k$, the ratio of the 
normalized moment of inertia of a cube to that of the best 
k-dimensional cell shape, which approaches 1.53 dB for large 
k, and the remainder, about .5 to .7 dB, is due to the 
oblongitis caused by the cubes being cut into pieces [383]. A 
paper investigating the nature of TSVQ cells is [569]. 

Our experience has been that when taking both 
performance and complexity into account, TSVQ is a very 
competitive VQ method. For example, we assert that for most 
of the fast search methods, one can find a TSVQ (with 
quite possibly a different dimension) that dominates it in 
the sense that D, R, A, and M are all at least as good. 
Indeed many of the fast search approaches use a tree 
structured prequantization. However, in TSVQ the searching 
tree and codebook are matched in size and character in a 
way that makes them work well together. A notable 
exception is the hierarchical table lookup VQ which attains a 
considerably smaller arithmetic complexity than attainable 
with TSVQ, at the expense of higher storage. The TSVQ 
will still be competitive in terms of throughput, however, 
as the tree-structured search is amenable to pipelining. 

TSVQs can be generalized to unbalanced trees (with 
variable depth as opposed to the fixed depth discussed 
above) [342], [94], [439], [196] and with larger branching 
factors than two or even variable branching factors [460]. 
However, it should be recalled that the goodness of the 
original TSVQ means that the gains of such generalizations 
are not likely to be substantial except in the low resolution case or if 
variable-rate coding is used or if the source has some com- 
plex structure that the usual greedy algorithm cannot ex- 
ploit. 

A tree structured quantizer is analogous to a 
classification or regression tree, and as such unbalanced TSVQs can 
be designed by algorithms based on a gardening metaphor 
of growing and pruning. The most well known is the CART 
algorithm of Breiman, Friedman, Olshen, and Stone [53], 
and the variation of CART for designing TSVQs bears their 
initials: the BFOS algorithm [94], [439], [196]. In this 
method, a balanced or unbalanced tree with more leaves 
than needed is first grown and then pruned. One can grow 
a balanced tree by splitting all nodes in each level of the 
tree, or an unbalanced one by splitting one node at a time, 
e.g., by splitting the 
node with the largest contribution to the distortion [342] 
or in a greedy fashion to maximize the decrease in 
distortion for the increase in rate [439]. Once grown, the tree 
can be pruned by removing all descendants of any internal 
node, thereby making it a leaf. This will increase average 
distortion, but will also decrease the rate. Once again, one 
can select for pruning the node that offers the best tradeoff 
in terms of the least increase in distortion per decrease in 
bits. It can be shown that, for quite general measures of 
distortion, pruning can be done in an optimal fashion and 
the optimal subtrees of decreasing rate are nested [94]. See 
also [355]. It seems likely that in the moderate to high rate 
case, pruning removes leaves corresponding to cells that are 
oblong such as cubes cut in half, leaving mainly cubic cells. 
We also wish to emphasize that if variable-rate 
quantization is desired, the pruning can be done so as to optimize 
the tradeoff between distortion and leaf entropy. 
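
One pruning step of this kind can be sketched as follows. The 
sketch is in the spirit of the pruning rule described above, not 
a reproduction of the published BFOS algorithm: it assumes a 
binary tree with heap-style numbering (root 1, children 2n and 
2n + 1), measures rate as average leaf depth, and takes as input 
a hypothetical table `stats[node] = (probability, distortion 
contribution if the node were a leaf)`. 

```python
def depth(node):
    # Heap numbering: the root is 1 and has depth 0.
    return node.bit_length() - 1


def leaves_under(node, leaves):
    """Leaves of the current tree that descend from `node`."""
    if node in leaves:
        return [node]
    return leaves_under(2 * node, leaves) + leaves_under(2 * node + 1, leaves)


def prune_once(stats, leaves):
    """Collapse the internal node with the smallest increase in average
    distortion per bit of rate saved. Returns (new leaf set, slope)."""
    best = None
    internal = {n >> j for n in leaves for j in range(1, depth(n) + 1)}
    for t in sorted(internal):
        sub = leaves_under(t, leaves)
        d_inc = stats[t][1] - sum(stats[n][1] for n in sub)
        r_dec = (sum(stats[n][0] * depth(n) for n in sub)
                 - stats[t][0] * depth(t))
        slope = d_inc / r_dec
        if best is None or slope < best[0]:
            best = (slope, t, sub)
    slope, t, sub = best
    return (leaves - set(sub)) | {t}, slope


# Hypothetical statistics for a balanced depth-2 tree.
STATS = {1: (1.0, 1.0),
         2: (0.5, 0.2), 3: (0.5, 0.4),
         4: (0.25, 0.05), 5: (0.25, 0.05),
         6: (0.25, 0.18), 7: (0.25, 0.18)}
```

Starting from the leaves {4, 5, 6, 7}, repeated calls collapse 
node 3, then node 2, then the root, with increasing slopes. The 
resulting leaf sets are nested, as the text indicates for the 
optimal subtrees. 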

There has been a flurry of recent work on the theory of 
tree growing algorithms for vector quantizers, which are a 
form of recursive partitioning. See for example the work 
of Nobel and Olshen [390], [388], [389]. For other work on 
tree growing and pruning see [393], [439], [276], [22], [355]. 

Multistage Vector Quantization 

Multistage (or multistep or cascade or residual) vector 
quantization was introduced by Juang and A. H. Gray, Jr. 
[274] as a form of tree-structured quantization with much 
reduced arithmetic complexity and storage. Instead of 
having a separate reproduction codebook for each branch in 
the tree, a single codebook could be used for all branches 
of a common length by coding the residual error 
accumulated to that point instead of coding the input vector 
directly. In other words, the quantization error (or residual) 
from the previous stage is quantized in the usual way by 
the following stage, and a reproduction is formed by 
summing the previous reproduction and the newly quantized 
residual. An example of a two-stage quantizer is depicted 
in Figure 9. The rate of the multistage quantizer is the 
sum of the rates of the stages, and the distortion is simply 
that of the last stage. (It is easily seen that the overall 
error is just that of the last stage.) A multistage quantizer 
has a direct sum reproduction codebook in the sense that 
it contains all codevectors formed by summing codevectors 
from the reproduction codebooks used at each stage. One 
may also view it as a kind of product code in the sense that 
the reproduction codebook is determined by the cartesian 
product of the stage codebooks. And like product 
quantization, its complexities (arithmetic and storage, encoding 
and decoding) are the sum of those of the stage 
quantizers plus a small amount for computing the residuals at the 
encoder or the sums at the decoder. In contrast, a 
conventional single stage quantizer with the same rate and 
dimension has complexities equal to the product of those 
of the stage quantizers. 

Fig. 9. Two-Stage VQ 
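
A minimal sketch of greedy multistage encoding and decoding, 
assuming squared error and full search within each stage. The 
function names and the two example codebooks are hypothetical. 

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(x, codebook):
    """Index of the codevector closest to x (full search)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))


def msvq_encode(x, stages):
    """Quantize x, then the residual, then the residual of that, etc.
    Returns one index per stage and the final residual."""
    residual, indices = list(x), []
    for codebook in stages:
        i = nearest(residual, codebook)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices, residual


def msvq_decode(indices, stages):
    """The reproduction is the sum of the selected stage codevectors."""
    out = [0.0] * len(stages[0][0])
    for i, codebook in zip(indices, stages):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


# Hypothetical two-stage quantizer for k = 2, four codevectors per stage.
STAGE1 = [(-2, -2), (-2, 2), (2, -2), (2, 2)]
STAGE2 = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
```

Here 8 stored codevectors give a direct sum codebook of 16 
reproductions. For x = (1.6, -2.3) the stages select (2, -2) and 
(-0.5, -0.5), the reproduction is (1.5, -2.5), and the overall 
error equals the residual left by the last stage. 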

Since the total rate is the sum of the stage rates, a bit 
allocation problem arises. In two-stage quantization 
using fixed-rate, unstructured, k-dimensional VQs in both 
stages, it usually happens that choosing both stages to have 
the same rate leads to the best performance vs. complexity 
tradeoff. In this case the complexities are approximately 
the square root of what they would be for a single stage 
quantizer. 

Though we restrict attention here to the case where all 
stages are fixed-rate vector quantizers with the same 
dimension, there is no reason why they need have the same 
dimension, have fixed rate, or have any similarity 
whatsoever. In other words, multistage quantization can be used 
(and often is) with very different kinds of quantizers in its 
stages (different dimensions and much different structures, 
e.g., DPCM or wavelet coding). For example, structuring 
the stage quantizers leads to good performance and further 
substantial reductions in complexity, e.g., [243], [79]. 

Of course, the multistage structuring leads to a 
suboptimal VQ for its given dimension. In particular, the 
direct sum form of the codebook is not usually optimal and 
the greedy search algorithm described above, in which the 
residual from one stage is quantized by the next, does not 
find the closest codevector in the direct sum codebook. 
Moreover, the usual greedy design method, which uses a 
Lloyd algorithm to design the first stage in the usual way 
and then to design the second stage to minimize distortion 
when operating on the errors of the first, and so on, does 
not, in general, design an optimal multistage VQ, even for 
greedy search. However, two-stage VQs designed in this 
way work fairly well. 

A high resolution analysis of two-stage VQ using 
Bennett's integral on the second stage can be found in [310], 
[309]. In order to apply Bennett's integral, it was necessary 
to find the form of the probability density of the 
quantization error produced by the first stage. This motivated 
the asymptotic error density analysis of vector 
quantization in [312], [379]. 

Multistage quantizers have been improved in a number 
of ways. More sophisticated (than greedy) encoding 
algorithms can take advantage of the direct sum nature of 
the codebook to make optimal or nearly optimal searches, 
though with some (and sometimes a great deal of) 
increased complexity. And more sophisticated design 
algorithms (than the greedy one) can also have benefits [32], 
[177], [81], [31], [33]. Variable-rate multistage quantizers 
have been developed [243], [297], [298], [441], [296]. 

Another way of improving multistage VQ is to adapt 
each stage to the outcome of the previous. One such 
scheme, introduced by Lee and Neuhoff [310], [309], was 
motivated by the observation that if the first stage 
quantizer has high rate, say $R_1$, then by Gersho's conjecture, the 
first stage cells all have approximately the shape of $T_k$, the 
tessellating polytope with least normalized moment of 
inertia, and the source density is approximately constant on 
them. This implies that the conditional distribution of the 
residual given that the source vector lies in the ith cell 
differs from that for the jth only by a scaling and rotation, 
because cell $S_j$ differs from $S_i$ by just a scaling and rotation. 
Therefore, if first-stage-dependent scaling and rotation are 
done prior to second stage quantization, the conditional 
distribution of the residual will be the same for all cells, 
and the second stage can be designed for this distribution, 
rather than having to be a compromise, as is otherwise the 
case in two-stage VQ. Moreover, since this distribution is 
essentially uniform on a support region shaped like $T_k$, the 
second stage can itself be a uniform tessellation. The net 
effect is a quantizer that inherits the optimal point 
density of the first stage$^{13}$ and the optimal cell shapes of the 
second. Therefore, in the high resolution case, this 
cell-conditioned two-stage VQ works essentially as well as an 
optimal (single-stage) VQ, but with much less complexity. 

Direct implementation of cell-conditioned two-stage VQ 
requires the storing of a scale factor and a rotation for 
each first stage cell, which operate on the first stage 
residual before quantization by the second stage. Their 
inverses are applied subsequently. However, since the first 
stage cells are so nearly spherical, the rotations gain only a 
small amount, typically about 0.1 dB, and may be omitted. 
Moreover, since the best known lattice tessellations are so 
close to the best known tessellations, one may use lattice 
VQ as the second stage, which further reduces complexity. 
Good schemes of this sort have even been developed for 
low to moderate rates by Gibson [270], [271] and Pan and 
Fischer [403], [404]. 

Cell-conditioned two-stage quantizers can be viewed as 
having a piecewise constant point density of the sort 
proposed earlier by Kuhlmann and Bucklew [302] as a means 
of circumventing the fact that optimal vector quantizers 
cannot be implemented with companders. This approach 
was further developed by Swaszek in [487]. 

Another scheme for adapting each stage to the 
previous is called codebook sharing, as introduced by Chan and 
Gersho [80], [82]. With this approach, each stage has a 
finite set of reproduction codebooks, one of which is used 
to quantize the residual, depending on the sequence of 
outcomes from the previous stages. Thus each codebook is 
shared among some subset of the possible sequences of 
outcomes from the previous stages. This method lies between 
conventional multistage VQ in which each stage has one 
codebook that is shared among all sequences of outcomes 
from previous stages, and TSVQ in which, in effect, a 
different codebook is used for each sequence of outcomes from 
the previous stages. Chan and Gersho introduced a 
Lloyd-style iterative design algorithm for designing shared 
codebooks; they showed that by controlling the number and 
rate of the codebooks one could optimize multistage VQ 
with a constraint on storage; and they used this method 
to good effect in audio coding [80]. In the larger scheme of 
things, TSVQ, multistage VQ and codebook sharing all fit 
within the broad family of generalized product codes that 
they introduced in [82]. 

13. Since the second stage uniformly refines the first stage cells, the 
overall point density is approximately that of the first stage. 

Feedback Vector Quantization 

Just as with scalar quantizers, a vector quantizer can be 
predictive; simply replace scalars with vectors in the 
predictive quantization structure depicted in Figure 3 [235], 
[116], [85], [417]. Alternatively, the encoder and decoder 
can share a finite set of states and a quantizer custom 
designed for each state. Both encoder and decoder must be 
able to track the state in the absence of channel errors, 
so that the state must be determinable from knowledge of 
an initial state combined with the binary codewords 
transmitted to the decoder. The result is a finite-state version 
of a predictive quantizer, referred to as a finite-state 
vector quantizer and depicted in Figure 10. Although little 
theory has been developed for finite-state quantizers [161], 
[178], [179], a variety of design methods exist [174], [175], 
[136], [236], [15], [16], [286], [196]. Lloyd's optimal decoder 
extends in a natural way to finite-state vector quantizers: 
the optimal reproduction decoder is a conditional 
expectation of the input vector given the binary codeword and 
the state. The optimal lossy encoder is not easily described, 
however, as the next state must be chosen in a way that 
ensures good future behavior, and not just in a greedy fashion 
that minimizes the current squared error. If look-ahead is 
allowed, however, then a tree or trellis search can be used 
to pick a long-term minimum distortion path, as will be 
considered in the next subsection. 

Fig. 10. Finite-state vector quantizer 
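
The state tracking requirement can be illustrated with a small 
sketch. It assumes squared error, a greedy encoder (which, as 
noted above, is not the optimal encoder), and a next-state table 
indexed by the current state and the transmitted index. The 
names and the two-state example are hypothetical. 

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def fsvq_encode(xs, codebooks, next_state, state=0):
    """Greedy finite-state encoder: search the codebook of the current
    state, then update the state from the index that was sent."""
    indices = []
    for x in xs:
        cb = codebooks[state]
        i = min(range(len(cb)), key=lambda j: sq_dist(x, cb[j]))
        indices.append(i)
        state = next_state[state][i]
    return indices


def fsvq_decode(indices, codebooks, next_state, state=0):
    """The decoder repeats the same state recursion, so it needs only
    the initial state and the received indices."""
    ys = []
    for i in indices:
        ys.append(codebooks[state][i])
        state = next_state[state][i]
    return ys


# Hypothetical two-state quantizer for k = 2: state 0 favors small
# vectors, state 1 favors large ones; the next state is the last index.
CODEBOOKS = {0: [(0, 0), (1, 1)], 1: [(1, 1), (2, 2)]}
NEXT_STATE = {0: [0, 1], 1: [0, 1]}
```

In the example below the third input, (0.2, 0.1), is reproduced 
as (1, 1) because the encoder is in state 1 when it arrives, 
which shows how a greedy choice of state can hurt later 
samples. 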

Both predictive and finite-state vector quantizers 
typically use memory in the lossy encoder, but use a 
memoryless lossless code independently applied to each successive 
binary codeword. One can of course also make the lossless 
code depend on the state, or be conditional on the previous 
binary codeword. One can also use a memoryless VQ 
combined with a conditional lossless code (conditioned on the 
previous binary codeword) designed with a conditional 
entropy constraint [95], [188]. A simple approach that works 
for TSVQ is to code the binary path to the codevector for 
the present source vector relative to the binary path to that 
of the previous source vector, which is usually very 
similar. This is a kind of interblock lossless coding [384], [410], 
[428]. 

Address-vector quantization, introduced by Nasrabadi 
and Feng [371] (see also [160], [373]), is another way to 
introduce memory into the lossy encoder of a vector quantizer 
with the goal of attaining higher dimensional performance 
with lower dimensional complexity. With this approach, in 
addition to the usual reproduction codebook C, there is an 
address codebook $C_a$ containing permissible sequences of 
indices of codevectors in C. The address codebook plays 
the same role as the outer code in a concatenated channel 
code (or the trellis in trellis encoded quantization discussed 
below), namely, it limits the allowable sequences of 
codewords from the inner code, which in this case is C. In 
this way, address-vector quantization can exploit the 
property that certain sequences of codevectors are much more 
probable than others; these will be the ones contained in 
$C_a$. 

As with DPCM, the introduction of memory into the 
lossy encoder seriously complicates the theory of such 
codes, which likely explains why there is so little. 

Tree/Trellis Encoded Quantization 

Channel coding has often inspired source coding or 
quantization structures. Channel coding matured much earlier, 
and the dual nature of channel and source coding suggests 
that a good channel code can be turned into a good source 
code by reversing the order of encoder and decoder. This 
role reversal was natural for the codes which eased search 
requirements by imposition of a tree or trellis structure. 
Unlike the tree-structured vector quantizers, these earlier 
systems imposed the tree structure on the sequence of 
symbols instead of on a single vector of symbols. For the 
channel coding case, the encoder was a convolutional code: 
input symbols were shifted into a shift register as output 
symbols, formed by linear combinations (in some field) of the 
shift register contents, were shifted out. Sequences of output 
symbols produced in this fashion could be depicted with a tree 
structure, where each node of the tree corresponded to the 
state of the shift register (all but the final or oldest 
symbol) and the branches connecting nodes were determined 
by the most recent symbol to enter the shift register and 
were labeled by the corresponding output, the output 
symbol resulting if that branch is taken. The goal of a channel 
decoder is to take such a sequence of tree branch labels 
that has been corrupted by noise, and find a minimum 
distance valid sequence of branch labels. This could be 
accomplished by a tree-search algorithm such as the Fano, 
stack, or M-algorithm. Since the shift register is finite, the 
tree becomes redundant and new nodes will correspond to 
previously seen states so that the tree diagram becomes a 
merged tree or trellis, which can be searched by a dynamic 
programming algorithm, the Viterbi algorithm, cf. [173]. In 
the early 1970s the algorithms for tree decoding channel 
codes were inverted to form tree-encoding algorithms for 
sources by Jelinek, Anderson, and others [268], [269], [11], 
[132], [123], [10]. Later, trellis channel decoding algorithms 
were modified to trellis-encoding algorithms for sources by 
Viterbi and Omura [519]. While linear encoders sufficed 
for channel coding, nonlinear decoders were required for 
the source coding application, and a variety of design 
algorithms were developed for designing the decoder to 
populate the trellis searched by the encoder [319], [531], [481], 
[18], [40]. Observe that the reproduction decoder of a 
finite-state VQ can be used as the decoder in a trellis-encoding 
system, where the finite-state encoder is replaced by a 
minimum distortion search of the decoder trellis implied by the 
finite-state VQ decoder, which is an optimal encoding for 
a sequence of inputs. 

Tree and trellis encoded quantizers can both be 
considered as a VQ with large blocklength and a reproduction 
codebook constrained to be the possible outputs of a 
nonlinear filter or a finite-state quantizer or vector quantizer 
of smaller dimension. Both structures produce long 
codewords with a trellis structure, i.e., successive reproduction 
symbols label the branches of a trellis and the encoder is 
just a minimum distortion trellis search algorithm such as 
the Viterbi algorithm. 
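
A minimal sketch of trellis encoding by a Viterbi search, 
assuming scalar samples, squared error, one bit per sample, and 
a decoder described by two hypothetical tables: `next_state[s][b]` 
and the branch label `label[s][b]`. The two-state trellis and 
its labels are made up for illustration. 

```python
def trellis_encode(xs, next_state, label, start=0):
    """Viterbi search: keep, for every state, the cheapest path that
    ends there, and extend all survivors by one branch per sample."""
    inf = float("inf")
    n_states = len(next_state)
    cost = [inf] * n_states
    cost[start] = 0.0
    paths = [[] for _ in range(n_states)]
    for x in xs:
        new_cost = [inf] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            if cost[s] == inf:
                continue  # state not reachable yet
            for bit in (0, 1):
                t = next_state[s][bit]
                c = cost[s] + (x - label[s][bit]) ** 2
                if c < new_cost[t]:
                    new_cost[t] = c
                    new_paths[t] = paths[s] + [bit]
        cost, paths = new_cost, new_paths
    best = min(range(n_states), key=lambda s: cost[s])
    return paths[best], cost[best]


def trellis_decode(bits, next_state, label, start=0):
    """The decoder is a finite-state machine driven by the path bits."""
    s, ys = start, []
    for b in bits:
        ys.append(label[s][b])
        s = next_state[s][b]
    return ys


# Hypothetical 2-state trellis: the state is the previous bit.
NEXT_STATE = [[0, 1], [0, 1]]
LABEL = [[-1.5, 0.5], [-0.5, 1.5]]
```

For the input [-0.6, 1.4] a greedy encoder would first choose the 
label -1.5 and end with total distortion 1.62, whereas the 
trellis search accepts a worse first branch (label 0.5) in order 
to reach the label 1.5 on the second sample, for a total of 
1.22. 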

Trellis Coded Quantization 

Trellis coded quantization, both scalar and vector, 
improves upon traditional trellis encoded systems by 
labeling the trellis branches with entire subcodebooks (or 
"subsets") rather than with individual reproduction levels [345], 
[344], [166], [167], [522], [343], [478], [514]. The primary 
gain resulting is a reduction in encoder complexity for a 
given level of performance. As the original trellis 
encoding systems were motivated by convolutional channel codes 
with Viterbi decoders, trellis coded quantization was 
motivated by Ungerboeck's enormously successful coded 
modulation approach to channel coding for narrowband 
channels [505], [506]. 

Recent applications of TCQ to coding wavelet 
coefficients [478] have yielded excellent performance in image 
coding, winning the JPEG 2000 contest of 
1997 and thereby a position as a serious contender for the 
new standard. 

Gaussian Quantizers 

Shannon [465] showed that a Gaussian i.i.d. source 
had the worst rate-distortion function of any i.i.d. source 
with the same variance, thereby showing that the Gaussian 
source was an extremum in a source coding sense. It was 
long assumed and eventually proved by Sakrison in 1975 
[456] that this provided a robust approach to quantization 
in the sense that there exist vector quantizers designed for the 
i.i.d. Gaussian source with a given average distortion which 
will provide no worse distortion when applied to any i.i.d. 
source with the same variance. This provided an approach 
to robust vector quantization, having a code that might not 
be optimal for the actual source, but which would perform 
no worse than it would on the Gaussian source for which 
it was designed. 

Sakrison extended the extremal properties of the rate 
distortion functions to sources with memory [453], [454], 
[455] and Lapidoth [306] (1997) showed that a code de- 
signed for a Gaussian source would yield essentially the 
same performance when applied to another process with 
the same covariance structure. 

These results are essentially Shannon theory and hence 
should be viewed as primarily of interest for high- 
dimensional quantizers. 

In a different approach towards using a Gaussian 
quantizer on an arbitrary source, Popat and Zeger (1992) took 
advantage of the central limit theorem and the known 
structure of an optimal scalar quantizer for a Gaussian ran- 
dom variable to code a general process by first filtering it to 
produce an approximately Gaussian density, scalar quan- 
tizing the result, and then inverse filtering to recover the 
original [419]. 

C. Robust Quantization 

The Gaussian quantizers were described as being robust 
in a minimax average sense: a vector quantizer suitably 
designed for a Gaussian source will yield no worse average 
distortion for any source in the class of all sources with 
the same second order properties. An alternative formula- 
tion of robust quantization is obtained if instead of dealing 
with average distortion, as is done in most of this paper, 
one places a maximum distortion requirement on quan- 
tizer design. Here a quantizer is considered to be robust 
if it bounds the maximum distortion for a class of sources. 
Morris and Vandelinde (1974) [361] developed the theory 
of robust quantization and provided conditions under which 
the uniform quantizer is optimum in this minimax sense. 
This can be viewed as a variation on epsilon entropy since 
the goal is to minimize the maximum distortion. Further 
results along this line may be found in [37], [275], [491]. 
Because these are minimax results aimed at scalar 
quantization, these results apply to any rate or dimension. 

D. Universal Quantization 



The minimax approaches provide one means of designing 
a fixed rate quantizer for a source with unknown or par- 
tially known statistics: a quantizer can be designed that 
will perform no worse than a fixed value of distortion for all 
sources in some collection. An alternative approach is to be 
more greedy and try to design a code that yields nearly op- 
timal performance regardless of which source within some 
collection is actually coded. This is the idea behind uni- 
versal quantization. 

Universal quantization or universal source coding had 
its origins in an approach to universal lossless compression 
developed by Rice and Plaunt [435], [436] and dubbed the 
"Rice machine." Their idea was to have a lossless coder 
that would work well for distinct sources by running 
multiple lossless codes in parallel and choosing the one 
producing the fewest bits for a period of time, sending a small 
amount of overhead to inform the decoder which code the 
encoder was using. The classic work on lossy universal 
source codes was Ziv's 1972 paper [577], which proved the 
existence of fixed-rate universal lossy codes under certain 
assumptions on the source statistics and the source and 
codebook alphabets. The multiple codebook idea was also 
used in 1974 [221] to extend the Shannon source coding 
theorem to nonergodic stationary sources by using the 
ergodic decomposition to interpret a nonergodic source as 
a universal coding problem for a family of ergodic sources. 
The idea is easily described and provides one means of 
constructing universal codes. Suppose that one has a 
collection of k-dimensional codebooks $C_l$ with $2^{kR_l}$ codevectors, 
$l = 1, \ldots, K$, each designed for a different type of local 
behavior. For example, one might have different 
codebooks in an image coder for edges, textures, and 
gradients. The union codebook $\bigcup_{l=1}^{K} C_l$ then contains all the 
codevectors in all of the codes, for a total of $\sum_{l=1}^{K} 2^{kR_l}$ 
codevectors. Thus, for example, if all of the subcodebooks 
$C_l$ have equal rate $R_l = R$, then the rate of the universal 
code is $R + k^{-1} \log_2 K$ bits per symbol, which can be small 
if the dimension k is moderately large. This does not mean 
that it is necessary to use a large dimensional VQ, since 
the VQ can be a product VQ, e.g., for an image one could 
have k = 64 by coding each square of dimension 8 x 8 = 64 
using four applications of a VQ of dimension 4 x 4 = 16. 
If one had, say, 4 different codes, the resulting rate would 
be $R + 2/64 \approx R + 0.031$, which would be a small increase 
over the original rate if the original rate is, say, .25. 
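
The rate arithmetic and the union codebook search can be 
checked with a short sketch. It assumes squared error, 
equal-rate subcodebooks, and a base-2 logarithm; the function 
names and the toy codebooks are hypothetical. 

```python
import math


def universal_rate(R, k, K):
    """Rate in bits per symbol of a union of K equal-rate codebooks of
    dimension k: log2(K) extra bits per vector identify the codebook."""
    return R + math.log2(K) / k


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def union_encode(x, codebooks):
    """Full search of the union codebook. Returns the pair
    (which codebook, index within it) sent to the decoder."""
    best = None
    for which, codebook in enumerate(codebooks):
        for i, y in enumerate(codebook):
            d = sq_dist(x, y)
            if best is None or d < best[0]:
                best = (d, which, i)
    return best[1], best[2]


# Hypothetical pair of two-codevector codebooks for k = 2.
CODEBOOKS = [[(0, 0), (1, 0)], [(0, 1), (5, 5)]]
```

With R = .25, k = 64, and K = 4, `universal_rate` gives 
0.28125 bits per symbol, in agreement with the figure of 
roughly R + 0.031 quoted above. 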

A universal code is in theory more complicated than 
an ordinary code, but in practice it can mean codes with 
smaller dimension might be more efficient since separate 
codebooks can be used for distinct short term behavior. 

Subsequently a variety of notions of fixed-rate uni- 
versal codes were considered and compared [382], and 
fixed-distortion codes with variable-rate were developed by 
Mackenthun and Pursley [340] and Kieffer [277], [279]. 

As with the early development of block source codes,
universal quantization during its early days in the 1970s
was viewed as more of a method for developing the theory
than as a practical code design algorithm. The Rice
machine, however, proved the practicality and importance of
a simple multiple codebook scheme for handling composite
sources.

These works all assumed the encoder and decoder to
possess copies of the codebooks being used. Zeger, Bist, and
Linder [566] considered systems where the codebooks are
designed at the encoder, but must also be coded and
transmitted to the decoder, as is commonly done in codebook
replenishment [206].

A good review of the history of universal source coding
through the early 1990s may be found in Kieffer
(1993) [283].

Better performance tradeoffs can be achieved by allowing
both rate and distortion to vary, and in 1996 Chou
et al. [92] formulated the universal coding problem as an
entropy-constrained vector quantization problem for a family
of sources and provided existence proofs and Lloyd-style
design algorithms for the collection of codebooks subject
to a Lagrangian distortion measure, yielding a fixed
rate-distortion slope optimization rather than fixed
distortion or fixed rate. The clustering of codebooks was
originally due to Chou [90] in 1991. High resolution
quantization theory was used to study rates of convergence
with blocklength to the optimal performance, yielding
results consistent with earlier convergence results
developed by other means, e.g., Linder et al. [321]. The
fixed-slope universal quantizer approach was further
developed with other code structures
and design algorithms by Yang et al. [558].
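
The fixed-slope idea can be illustrated with a small sketch.
It is not the design algorithm of [92]; it only shows, for
assumed toy codebooks and an assumed multiplier `lam`, how an
encoder can choose the codebook and codevector that minimize
squared error plus `lam` times the bits spent, where the bits
include a fixed-length codebook index. The function name and
the rate model are hypothetical.

```python
import math

def encode_fixed_slope(x, codebooks, lam):
    """Return (codebook index, codevector index, cost) minimizing
    squared error + lam * bits over a family of codebooks.
    Bits = fixed-length index of the codebook + fixed-length index
    of the codevector inside it (an assumed rate model)."""
    book_bits = math.log2(len(codebooks))
    best = None
    for b, book in enumerate(codebooks):
        bits = book_bits + math.log2(len(book))
        for i, y in enumerate(book):
            dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
            cost = dist + lam * bits
            if best is None or cost < best[2]:
                best = (b, i, cost)
    return best

# Two toy 2-dimensional codebooks: a coarse one and a finer one.
coarse = [(0.0, 0.0), (1.0, 1.0)]
fine = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
x = (0.2, 0.9)
low_slope = encode_fixed_slope(x, [coarse, fine], lam=0.01)
high_slope = encode_fixed_slope(x, [coarse, fine], lam=10.0)
```

With a small multiplier the encoder prefers the finer codebook,
and with a large one it falls back to the cheaper coarse
codebook, so both rate and distortion vary with the input while
the slope stays fixed.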

A different approach which more closely resembles
traditional adaptive and codebook replenishment schemes was
developed by Zhang, Yang, Wei, and Liu [329], [575], [574].
Their approach, dubbed "gold washing," did not involve
training, but rather created and removed codevectors
according to the data received and an auxiliary random
process in a way that could be tracked by a decoder without
side information.



E. Dithering 

Dithered quantization was introduced by Roberts [442]
in 1962 as a means of randomizing the effects of uniform
quantization so as to minimize visual artifacts. It was
further developed for images by Limb (1969) [317] and for
speech by Jayant and Rabiner (1972) [266]. Intuitively,
the goal was to cause the reconstruction error to look more
like signal-independent additive white noise. It turns out
that for one type of dithering, this intuition is true. In a
dithered quantizer, instead of quantizing an input signal
$X_n$ directly, one quantizes a signal $U_n = X_n + W_n$, where
$W_n$ is a random process, independent of the signal $X_n$,
called a dither process. The dither process is usually
assumed to be i.i.d. There are two approaches to dithering.
Roberts considered subtractive dithering, where the final
reconstruction is formed as $\hat{X}_n = q(X_n + W_n) - W_n$. An
obvious problem is the need for the decoder to possess a
copy of the dither signal. Nonsubtractive dithering forms
the reproduction as $\hat{X}_n = q(X_n + W_n)$.
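
The two reconstruction rules can be written out directly. The
following Python sketch assumes a uniform quantizer with step
`delta` that rounds to the nearest multiple of `delta` and never
overloads; the function names are illustrative only.

```python
import random

def uniform_quantize(u, delta):
    """Round u to the nearest multiple of delta (no overload)."""
    return delta * round(u / delta)

def subtractive_dither(x, w, delta):
    """Reconstruction q(x + w) - w; the decoder must know w."""
    return uniform_quantize(x + w, delta) - w

def nonsubtractive_dither(x, w, delta):
    """Reconstruction q(x + w); the decoder needs no dither copy."""
    return uniform_quantize(x + w, delta)

rng = random.Random(0)                   # seeded so the run repeats
delta = 0.5
x = 0.2                                  # a single input sample
w = rng.uniform(-delta / 2, delta / 2)   # uniform dither sample
xs = subtractive_dither(x, w, delta)
xn = nonsubtractive_dither(x, w, delta)
```

The subtractive output stays within `delta / 2` of the input,
whereas the nonsubtractive output is always a quantizer level
and can be as far as `delta` from the input.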

The principal theoretical property of subtractive
dithering was developed by Schuchman [461], who showed
that the quantizer error $e_n = X_n - \hat{X}_n = X_n - q(X_n +
W_n) + W_n$ is uniformly distributed on $(-\Delta/2, \Delta/2]$ and is
independent of the original input signal $X_n$ if and only if
the quantizer does not overload and the characteristic
function $M_W(ju) = E[e^{juW}]$ satisfies
$M_W(j 2\pi l / \Delta) = 0$, $l \neq 0$.
Schuchman's conditions are satisfied, for example, if the
dither signal has a uniform probability density function on
$(-\Delta/2, \Delta/2]$. It follows from the work of Jayant and
Rabiner [266] and Sripad and Snyder [477] (see also [216])
that Schuchman's condition implies that the sequence of
quantization errors $\{e_n\}$ is independent. The case of
uniform dither remains by far the most widely studied in the
literature.
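
That uniform dither meets Schuchman's condition can be checked
in one line. The short derivation below assumes $W$ is uniform
on $(-\Delta/2, \Delta/2]$, where $\Delta$ is the quantizer
step and $M_W$ is the characteristic function of the dither.

```latex
M_W(ju) = E\left[e^{juW}\right]
        = \frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} e^{juw}\, dw
        = \frac{\sin(u\Delta/2)}{u\Delta/2},
\qquad
M_W\!\left(j\,\frac{2\pi l}{\Delta}\right)
        = \frac{\sin(\pi l)}{\pi l} = 0
\quad \text{for every integer } l \neq 0 .
```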

The subtractive dither result is nice mathematically
because it promises a well behaved quantization noise as well
as quantization error. It is impractical in many
applications, however, for two reasons. First, the receiver
will usually not have a perfect analog link to the
transmitter (or else the original signal could be sent in
analog form) and hence a pseudo-random deterministic
sequence must be used at both transmitter and receiver as
proposed by Roberts. In this case, however, there will be no
mathematical guarantee that the quantization error and noise
have the properties which hold for genuinely random i.i.d.
dither. Second, subtractive dither of a signal that indeed
resembles a sample function of a memoryless random
process is complicated to implement, requiring storage of the
dither signal, high precision arithmetic, and perfect
synchronization. As a result, it is of interest to study the
behavior of the quantization noise in a simple nonsubtractive
dithered quantizer. Unlike subtractive dither,
nonsubtractive dither is not capable of making the
reconstruction error independent of the input signal
(although claims to the contrary have been made in the
literature). Proper choice of dithering function can,
however, make the conditional moments of the reproduction
error independent of the input signal. This can be
practically important. For example, it can make the
perceived quantization noise energy constant as an input
signal fades from high intensity to low intensity, where
otherwise it can (and does) exhibit strongly
signal-dependent behavior. The properties of nonsubtractive
dither were originally developed in unpublished work
by Wright [542] in 1979 and Brinton [54] in 1984 and
subsequently extended and refined with a variety of proofs [513],
[512], [328], [227]. For any $k = 1, 2, \ldots$ necessary and
sufficient conditions on the characteristic function $M_W$ are
known which ensure that the $k$th moment of the quantization
noise $\epsilon_n = q(X_n + W_n) - X_n$ conditional on $X_n$ does
not depend on $X_n$. A sufficient condition is that the dither
signal consists of the sum of $k$ independent uniformly
distributed random variables on $[-\Delta/2, \Delta/2]$.
Unfortunately this conditional independence of moments comes
at the expense of a loss of fidelity. For example, if $k = 2$
then the quantizer noise power (the mean squared error) will be

$$E[\epsilon^2 \mid X] = E[\epsilon^2] = E[W^2] + \frac{\Delta^2}{12}.$$

This means that the power in the dither signal is directly
added to that of the quantizer error in order to form the
overall mean squared error.
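
The k = 2 case can be checked by simulation. The sketch below
is a Monte Carlo illustration, not a proof: it assumes a
rounding uniform quantizer with step `delta`, a dither formed as
the sum of two independent uniform variables on
[-delta/2, delta/2] (so that E[W^2] = 2 delta^2 / 12), and a few
arbitrarily chosen input values. All names are illustrative.

```python
import random

def uniform_quantize(u, delta):
    """Round u to the nearest multiple of delta (no overload)."""
    return delta * round(u / delta)

def conditional_noise_power(x, delta, n, rng):
    """Monte Carlo estimate of E[eps^2 | X = x], where
    eps = q(x + W) - x and W is the sum of two independent
    uniform variables on [-delta/2, delta/2] (the k = 2 case)."""
    total = 0.0
    for _ in range(n):
        w = (rng.uniform(-delta / 2, delta / 2)
             + rng.uniform(-delta / 2, delta / 2))
        eps = uniform_quantize(x + w, delta) - x
        total += eps * eps
    return total / n

rng = random.Random(1)
delta = 1.0
predicted = 2 * delta**2 / 12 + delta**2 / 12   # E[W^2] + delta^2/12
estimates = {x: conditional_noise_power(x, delta, 100_000, rng)
             for x in (0.0, 0.3, 0.5)}
```

Each estimate should land close to `predicted`, which equals
delta^2 / 4, regardless of the input value. Without dither the
squared error at these same inputs would be 0, 0.09, and 0.25,
which is the signal dependence that the dither removes.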

In addition to its role in whitening quantization noise
and making the noise or its moments independent of the
input, dithering has played a role in proofs of "universal
quantization" results in information theory. For example,
Ziv [578] showed that even without high resolution theory,
uniform scalar quantization combined with dithering and
vector lossless coding could yield performance within 0.75
bits/symbol of the rate-distortion function. Extensions to
lattice quantization and variations of this result have been
developed by Zamir and Feder [565].

F. Quantization for Noisy Channels 

The separation theorem of information theory [464],
[180] states that nearly optimal communication of an
information source over a noisy channel can be accomplished
by separately quantizing or source coding the source and
channel coding or error-control coding the resulting
encoded source for reliable transmission over a noisy
channel. Moreover, these two coding functions can be designed
separately, without knowledge of each other. The result is
only for point-to-point communications, however, and it is
a limiting result in the sense that large block lengths and
hence large complexity must be permitted. If one wishes
to perform near the Shannon limit for moderate delay or
blocklengths, or in multiuser situations, it is necessary to
consider joint source and channel codes, codes which jointly
consider quantization and reliable communication. It may
not actually be necessary to combine the source and
channel codes, but simply to jointly design them. There are a
variety of code structures and design methods that have
been considered for this purpose, many of which involve
issues of channel coding which are well beyond the focus
of this paper. Here we mention only schemes which can
be viewed as quantizers which are modified for use on a
noisy channel and not those schemes which involve explicit
channel codes. More general discussions can be found, e.g.,
in [122].

One approach to designing quantizers for use on noisy 
channels is to replace the distortion measure with respect 
to which a quantizer is optimized by the expected distortion 
over the noisy channel. This simple modification of the dis- 
tortion measure allows the channel statistics to be included 
in an optimal quantizer design formulation. Recently the 
method has been referred to as "channel-optimized quan- 
tization," where the quantization might be scalar, vector, 
or trellis. 
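
The modified distortion can be made concrete with a small
sketch. It assumes a fixed scalar codebook, squared error,
indices sent as natural binary words over a binary symmetric
channel with crossover probability `p`, and a decoder that
simply outputs the level of whatever index arrives. Only the
encoder rule is shown; a full design would also re-optimize
the levels. All names and parameter values are illustrative.

```python
def bsc_transition(i, j, nbits, p):
    """P(receive index j | send index i) on a binary symmetric
    channel with crossover probability p, nbits per index."""
    flips = bin(i ^ j).count("1")
    return (p ** flips) * ((1 - p) ** (nbits - flips))

def expected_distortion(x, i, levels, nbits, p):
    """Modified distortion: squared error averaged over the index
    the decoder may actually receive when i is sent."""
    return sum(bsc_transition(i, j, nbits, p) * (x - y) ** 2
               for j, y in enumerate(levels))

def channel_optimized_encode(x, levels, nbits, p):
    """Pick the index minimizing the expected distortion."""
    return min(range(len(levels)),
               key=lambda i: expected_distortion(x, i, levels, nbits, p))

levels = [-1.5, -0.5, 0.5, 1.5]      # assumed 2-bit scalar codebook
clean = channel_optimized_encode(0.9, levels, nbits=2, p=0.0)
noisy = channel_optimized_encode(0.9, levels, nbits=2, p=0.2)
```

On a clean channel the input 0.9 maps to the nearest level 0.5
(index 2). With p = 0.2 the encoder instead sends index 3,
because a bit error on index 2 can land on the distant level
-1.5, so the expected distortion of index 3 is lower.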

This approach was introduced in 1969 by Kurtenbach
and Wintz [304] for scalar quantizers. A Shannon source
coding theorem for trellis encoders using this distortion
measure was proved in 1981 [135] and a Lloyd-style design
algorithm for such encoders provided in 1987 [19]. A Lloyd
algorithm for vector quantizers using the modified
distortion measure was introduced in 1984 by Kumazawa,
Kasahara, and Namekawa [303] and further studied in [157],
[152], [153]. The method has also been applied to
tree-structured VQ [412]. It can be combined with a
maximum likelihood detector to further improve performance
and permit progressive transmission over a noisy
channel [411], [523]. Simulated annealing has also been used
to design such quantizers [140], [152], [354].

Another approach to joint source and channel coding
based on a quantizer structure and not explicitly
involving typical channel coding techniques is to design a
scalar or vector quantizer for the source without regard to
the channel, but then code the resulting indices in a way
that ensures that small (large) Hamming distance of the
channel codewords corresponds to small (large) distortion
between the resulting reproduction codewords, essentially
forcing the topology on the channel codewords to
correspond to that of the resulting reproduction codewords.
The codes that do this are often called index assignments.
Several specific index assignment methods were considered
by Rydbeck and Sundberg [448]. De Marca and Jayant in
1987 [121] introduced an iterative search algorithm for
designing index assignments for scalar quantizers, which was
extended to vector quantization by Zeger and Gersho [568],
who dubbed the approach "pseudo-Gray coding." Other
index assignment algorithms include [210], [543], [287].
For binary symmetric channels and certain special sources
and quantizers, analytical results have been obtained [555],
[556], [250], [501], [112], [351], [42], [232], [233], [352]. For
example, it was shown by Crimmins et al. in 1969 [112]
that the index assignment that minimizes mean squared
error for a uniform scalar quantizer used on a binary
symmetric channel is the natural binary assignment. However,
this result remained relatively unknown until rederived and
generalized in [351].
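
The effect of an index assignment is easy to compute for a
tiny example. The sketch below assumes a 2-bit uniform scalar
quantizer with equally likely levels (an added assumption), a
binary symmetric channel with crossover probability `p`, and
counts only the distortion caused by bit errors. It compares
the natural binary assignment with a Gray code and, since
there are only 24 assignments, with an exhaustive search.
The names are illustrative.

```python
from itertools import permutations

def channel_mse(assignment, levels, nbits, p):
    """Mean squared error caused by bit errors alone when level
    levels[m] is sent as the codeword assignment[m] over a binary
    symmetric channel with crossover probability p, assuming all
    levels are equally likely."""
    decode = {c: m for m, c in enumerate(assignment)}
    total = 0.0
    for m, c in enumerate(assignment):
        for r in range(2 ** nbits):
            flips = bin(c ^ r).count("1")
            prob = (p ** flips) * ((1 - p) ** (nbits - flips))
            total += prob * (levels[m] - levels[decode[r]]) ** 2
    return total / len(levels)

nbits, p = 2, 0.1
levels = [0.0, 1.0, 2.0, 3.0]        # uniform scalar quantizer levels
natural = (0, 1, 2, 3)               # level m -> binary value m
gray = (0, 1, 3, 2)                  # reflected Gray code
mse_natural = channel_mse(natural, levels, nbits, p)
mse_gray = channel_mse(gray, levels, nbits, p)
mse_best = min(channel_mse(a, levels, nbits, p)
               for a in permutations(range(2 ** nbits)))
```

In this toy case the natural binary assignment gives a channel
mean squared error of 5p = 0.5, the Gray code gives
6p - 2p^2 = 0.58, and no permutation does better than natural
binary, in line with the result of [112].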

When source and channel codes are considered together,
a key issue is the determination of the quantization rate
to be used when the total number of channel symbols
per source symbol is held fixed. For example, as
quantization rate is increased, the quantization noise
decreases but channel-induced noise increases because the
ability of the channel code to protect the bits is reduced.
Clearly, there is an optimal choice of quantization rate.
Another issue is the determination of the rate at which
overall distortion decreases in an optimal system as the
total number of channel uses per source symbol increases.
These issues have been addressed in recent papers by Zeger
and Manzella [570] and Hochwald and Zeger [244], which use
both exponential formulas produced by high resolution
quantization theory and
exponential bounds to channel coding error probability.

There are a variety of other approaches to joint source
and channel coding, including the use of codes with a
channel encoder structure optimized for the source or with a
special decoder matched to the source, the use of unequal
error protection to better protect more important (lower
resolution) reproduction indexes, jointly optimized
combinations of source and channel codes, and combinations of
channel-optimized quantizers with source-optimized channel
codes, but we leave these to the literature as they
involve a heavy dose of channel coding ideas.



G. Quantizing Noisy Sources 

A parallel problem to quantizing for a noisy channel is
quantizing for a noisy source. The problem can be seen
as trying to compress a dirty source into a clean
reproduction, or as doing estimation of the original source
based on a quantized version of a noise-corrupted version. If
the underlying statistics are known or can be estimated by a
training sequence, then this can be treated as a
quantization problem with a modified distortion measure, where
now the distortion between a noise-corrupted observation
$Y = y$ of an unseen original $X$ and a reconstruction $\hat{x}$
based on the encoded and decoded $y$ is given as the
conditional expectation $E[d(X, \hat{x}) \mid Y = y]$. The
usefulness of this modified distortion for source coding
noisy sources was first seen by Dobrushin and Tsybakov
(1962) [134] and was used by Fine (1965) [162] and Sakrison
(1968) [452] to obtain information theoretic bounds on
quantization and source coding for noisy sources. Berger
(1971) [46] explicitly used the modified distortion in his
study of Shannon source coding
theorems for noise-corrupted sources.

In 1970 Wolf and Ziv [537] used the modified distortion
measure for a squared-error distortion to prove that the
optimal quantizer for the modified distortion could be
decomposed into the cascade of a minimum mean-squared
error estimator followed by an optimal quantizer for the
estimated original source. This result was subsequently
extended to a more general class of distortion measures,
including the input-weighted quadratic distortion of Ephraim
and Gray [145], where a generalized Lloyd algorithm for
design was presented.
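
The reason the cascade structure arises for squared error can
be seen from a standard identity, sketched here for a scalar
source. Write $\bar{x}(y) = E[X \mid Y = y]$ for the minimum
mean-squared error estimate (notation introduced only for
this sketch).

```latex
E\left[(X - \hat{x})^2 \mid Y = y\right]
  = E\left[(X - \bar{x}(y))^2 \mid Y = y\right]
  + \left(\bar{x}(y) - \hat{x}\right)^2 ,
\qquad \bar{x}(y) = E[X \mid Y = y].
```

The cross term vanishes because
$E[X - \bar{x}(y) \mid Y = y] = 0$. The first term does not
depend on the reproduction $\hat{x}$, so minimizing the
modified distortion amounts to quantizing the estimate
$\bar{x}(Y)$ under ordinary squared error.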

Related results and approaches can be found in
Witsenhausen's (1980) [535] treatment of rate distortion theory
with modified (or "indirect") distortion measures, and in
the Occam filters of Natarajan (1995) [370].

H. Multiple Description Quantization 

A topic closely related to quantization for noisy channels
is multiple description quantization. The problem is
usually formulated as a source coding or quantization problem
over a network, but it is most easily described in terms of
packet communications. In the simplest case, suppose that
two packets of information, each of rate $R$, are
transmitted to describe a reproduction of a single random
vector $X$. The decoder might receive one or the other packet
or the two together and wishes to provide the best
reconstruction possible for the bit rate it receives. This
can be viewed as a network problem with one receiver seeing
only one channel, another receiver seeing the second channel,
and a third receiver seeing both channels, and the goal is
that each have an optimal reconstruction for the total
received bit rate. Clearly one can do no better than having
each packet alone result in a reproduction with distortion
near the Shannon distortion-rate function $D(R)$ while
simultaneously having the two packets together yield a
reproduction with distortion near $D(2R)$, but this optimistic
performance is in general not possible. This problem was
first tackled in the information theory community in 1980
by Wolf, Wyner, and Ziv [536] and Ozarow [401], who
developed achievable rate regions and lower bounds to
performance. The results were extended by El Gamal and Cover
(1982) [139], Ahlswede (1985) [6], and Zhang and Berger
(1987) [573].
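
A very simple two-description scheme helps make the tradeoff
tangible. The sketch below is an illustration chosen for this
purpose, not one of the designs cited here: it assumes two
uniform scalar quantizers with the same step `delta` whose
cells are offset by `delta / 2`, a source uniform on [0, 8),
and midpoint reconstruction. All names are hypothetical.

```python
import math
import random

def describe(x, delta):
    """Return two indices from uniform quantizers of step delta
    whose cells are offset from each other by delta / 2."""
    i1 = math.floor(x / delta)          # cell [i1, i1 + 1) * delta
    i2 = math.floor(x / delta + 0.5)    # cell [i2 - 1/2, i2 + 1/2) * delta
    return i1, i2

def decode_first(i1, delta):
    return (i1 + 0.5) * delta           # midpoint of the first cell

def decode_second(i2, delta):
    return i2 * delta                   # midpoint of the shifted cell

def decode_both(i1, i2, delta):
    # The two cells overlap in an interval of width delta / 2 whose
    # midpoint is the average of the two single-packet outputs.
    return 0.5 * (decode_first(i1, delta) + decode_second(i2, delta))

rng = random.Random(2)
delta, n = 1.0, 100_000
side1 = side2 = central = 0.0
for _ in range(n):
    x = rng.uniform(0.0, 8.0)           # assumed uniform source
    i1, i2 = describe(x, delta)
    side1 += (x - decode_first(i1, delta)) ** 2
    side2 += (x - decode_second(i2, delta)) ** 2
    central += (x - decode_both(i1, i2, delta)) ** 2
side1, side2, central = side1 / n, side2 / n, central / n
```

Each packet alone gives a mean squared error near
delta^2 / 12, and both together give roughly delta^2 / 48.
The second packet therefore buys only a factor of 4, whereas
spending the same total rate on a single finer uniform
quantizer would, at high resolution, reduce distortion by
about 2^(2R). Good side performance in this scheme comes at
the cost of central performance.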

In 1993 Vaishampayan et al. used a Lloyd algorithm
to actually design fixed-rate [508] and
entropy-constrained [509] scalar quantizers for the multiple
description problem. High resolution quantization ideas were
used to evaluate achievable performance in 1998 by
Vaishampayan and Batllo [510] and Linder, Zamir, and Zeger [324].
An alternative approach to multiple description
quantization using transform coding has also been considered,
e.g., in [38], [211].

I. Other Applications

We have not treated many interesting variations and 
applications of quantization, several of which have been 
successfully analyzed or designed using the tools described 
here. Examples which we would have included had time, 
space, and patience been more plentiful include mismatch 
results for quantizers designed for one distribution and ap- 
plied to another, quantizers designed to provide inputs to 
classification, detection, or estimation systems, quantiz- 
ers in multiuser systems such as simple networks, quan- 
tizers implicit in finite-precision arithmetic (the modern 
form of roundoff error), and quantization in noise-shaping 
analog-to-digital and digital-to-analog converters such as 
ΣΔ-modulators. Doubtless we have failed to mention a
few, but this list suffices to demonstrate how rich the
theoretical and applied fields of quantization have become in
their half century of active development.